(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5116 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
ACAGCGTTCT CTTAATACTA GTACAAACCC ACAATAAAAT ATGACAAACA ACAATTACAA 60
CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAATA GTATAAATCC GCCATATAAA 120
ATGGTATAAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC 180
TTTCATCTTT CATCTTTCAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC 240
ACATGCCCTG ATGAACCGAG GGAAGGGAGG GAGGGGCAAG AATGAAGAGG GAGCTGAACG 300
AACGCAAATG ATAAAGTAAT TTAATTGTTC AACTAACCTT AGGAGAAAAT ATGAACAAGC 360
TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT GAATTGGCAC 420
GGGGTTGTGA CCATTCCACA GAAAAAGGCA GCGAAAAACC TGCTCGCATG AAAGTGCGTC 480
ACTTAGCGTT AAAGCCACTT TCCGCTATGT TACTATCTTT AGGTGTAACA TCTATTCCAC 540
AATCTGTTTT AGCAAGCGGC TTACAAGGAA TGGATGTAGT ACACGGCACA GCCACTATGC 600
AAGTAGATGG TAATAAAACC ATTATCCGCA ACAGTGTTGA CGATATCATT AATTGGAAAC 660
AATTTAACAT CGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAACAAC AACTCCGCCG 720
TATTCAACCG TGTTACATCT AACCAAATCT CCCAATTAAA AGGGATTTTA GATTCTAACG 780
GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA ATTATTAACA 840
CTAATGGCTT TACGGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG GCGCGTAATT 900
TCACCTTCGA GCAAACCAAA GATAAAGCGC TCGCTGAAAT TGTGAATCAC GGTTTAATTA 960
CTGTCGGTAA AGACGGCAGT GTAAATCTTA TTGGTGGCAA AGTGAAAAAC GAGGGTGTGA 1020
TTAGCGTAAA TGGTGGCAGC ATTTCTTTAC TCGCAGGGCA AAAAATCACC ATCAGCGATA 1080
TAATAAACCC AACCATTACT TACAGCATTG CCGCGCCTGA AAATGAAGCG GTCAATCTGG 1140
GCGATATTTT TGCCAAAGGC GGTAACATTA ATGTCCGTGC TGCCACTATT CGAAACCAAG 1200
GTAAACTTTC TGCTGATTCT GTAAGCAAAG ATAAAAGCGG CAATATTGTT CTTTCCGCCA 1260
AAGAGGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA GCTAAAGGCG 1320
GCAAGCTGAT GATTACAGGC GATAAAGTCA CATTAAAAAC AGGTGCAGTT ATCGACCTTT 1380
CAGGTAAAGA AGGGGGAGAA ACTTACCTTG GCGGTGACGA GCGCGGCGAA GGTAAAAAGG 1440
GCATTCAATT AGCAAAGAAA ACCTCTTTAG AAAAAGGCTC AACCATCAAT GTATCAGGCA 1500
AAGAAAAAGG CGGACGCGCT ATTGTGTGGG GCGATATTGC GTTAATTGAC GGCAATATTA 1560
ACGCTCAAGG TAGTGGTGAT ATCGCTAAAA CCGGTGGTTT TGTGGAGACG TCGGGGCATG 1620
ATTTATTCAT CAAAGACAAT GCAATTGTTG ACGCCAAAGA GTGGTTGTTA GACCCGGATA 1680
ATGTATCTAT TAATGCAGAA ACAGCAGGAC GCAGCAATAC TTCAGAAGAC GATGAATACA 1740
CGGGATCCGG GAATAGTGCC AGCACCCCAA AACGAAACAA AGAAAAGACA ACATTAACAA 1800
ACACAACTCT TGAGAGTATA CTAAAAAAAG GTACCTTTGT TAACATCACT GCTAATCAAC 1860
GCATCTATGT CAATAGCTCC ATTAATTTAT CCAATGGCAG CTTAACTCTT TGGAGTGAGG 1920
GTCGGAGCGG TGGCGGCGTT GAGATTAACA ACGATATTAC CACCGGTGAT GATACCAGAG 1980
GTGCAAACTT AACAATTTAC TCAGGCGGCT GGGTTGATGT TCATAAAAAT ATCTCACTCG 2040
GGGCGCAAGG TAACATAAAC ATTACAGCTA AACAAGATAT CGCCTTTGAG AAAGGAAGCA 2100
ACCAAGTCAT TACAGGTCAA GGGACTATTA CCTCAGGCAA TCAAAAAGGT TTTAGATTTA 2160
ATAATGTCTC TCTAAACGGC ACTGGCAGCG GACTGCAATT CACCACTAAA AGAACCAATA 2220







AATACGCTAT CACAAATAAA TTTGAAGGGA CTTTAAATAT TTCAGGGAAA GTGAACATCT 2280
CAATGGTTTT ACCTAAAAAT GAAAGTGGAT ATGATAAATT CAAAGGACGC ACTTACTGGA 2340
ATTTAACCTC CTTAAATGTT TCCGAGAGTG GCGAGTTTAA CCTCACTATT GACTCCAGAG 2400
GAAGCGATAG TGCAGGCACA CTTACCCAGC CTTATAATTT AAACGGTATA TCATTCAACA 2460
AAGACACTAC CTTTAATGTT GAACGAAATG CAAGAGTCAA CTTTGACATC AAGGCACCAA 2520
TAGGGATAAA TAAGTATTCT AGTTTGAATT ACGCATCATT TAATGGAAAC ATTTCAGTTT 2580
CGGGAGGGGG GAGTGTTGAT TTCACACTTC TCGCCTCATC CTCTAACGTC CAAACCCCCG 2640
GTGTAGTTAT AAATTCTAAA TACTTTAATG TTTCAACAGG GTCAAGTTTA AGATTTAAAA 2700
CTTCAGGCTC AACAAAAACT GGCTTCTCAA TAGAGAAAGA TTTAACTTTA AATGCCACCG 2760
GAGGCAACAT AACACTTTTG CAAGTTGAAG GCACCGATGG AATGATTGGT AAAGGCATTG 2820
TAGCCAAAAA AAACATAACC TTTGAAGGAG GTAACATCAC CTTTGGCTCC AGGAAAGCCG 2880
TAACAGAAAT CGAAGGCAAT GTTACTATCA ATAACAACGC TAACGTCACT CTTATCGGTT 2940
CGGATTTTGA CAACCATCAA AAACCTTTAA CTATTAAAAA AGATGTCATC ATTAATAGCG 3000
GCAACCTTAC CGCTGGAGGC AATATTGTCA ATATAGCCGG AAATCTTACC GTTGAAAGTA 3060
ACGCTAATTT CAAAGCTATC ACAAATTTCA CTTTTAATGT AGGCGGCTTG TTTGACAACA 3120
AAGGCAATTC AAATATTTCC ATTGCCAAAG GAGGGGCTCG CTTTAAAGAC ATTGATAATT 3180
CCAAGAATTT AAGCATCACC ACCAACTCCA GCTCCACTTA CCGCACTATT ATAAGCGGCA 3240
ATATAACCAA TAAAAACGGT GATTTAAATA TTACGAACGA AGGTAGTGAT ACTGAAATGC 3300
AAATTGGCGG CGATGTCTCG CAAAAAGAAG GTAATCTCAC GATTTCTTCT GACAAAATCA 3360
ATATTACCAA ACAGATAACA ATCAAGGCAG GTGTTGATGG GGAGAATTCC GATTCAGACG 3420
CGACAAACAA TGCCAATCTA ACCATTAAAA CCAAAGAATT GAAATTAACG CAAGACCTAA 3480
ATATTTCAGG TTTCAATAAA GCAGAGATTA CAGCTAAAGA TGGTAGTGAT TTAACTATTG 3540
GTAACACCAA TAGTGCTGAT GGTACTAATG CCAAAAAAGT AACCTTTAAC CAGGTTAAAG 3600
ATTCAAAAAT CTCTGCTGAC GGTCACAAGG TGACACTACA CAGCAAAGTG GAAACATCCG 3660
GTAGTAATAA CAACACTGAA GATAGCAGTG ACAATAATGC CGGCTTAACT ATCGATGCAA 3720
AAAATGTAAC AGTAAACAAC AATATTACTT CTCACAAAGC AGTGAGCATC TCTGCGACAA 3780
GTGGAGAAAT TACCACTAAA ACAGGTACAA CCATTAACGC AACCACTGGT AACGTGGAGA 3840
TAACCGCTCA AACAGGTAGT ATCCTAGGTG GAATTGAGTC CAGCTCTGGC TCTGTAACAC 3900
TTACTGCAAC CGAGGGCGCT CTTGCTGTAA GCAATATTTC GGGCAACACC GTTACTGTTA 3960
CTGCAAATAG CGGTGCATTA ACCACTTTGG CAGGCTCTAC AATTAAAGGA ACCGAGAGTG 4020
TAACCACTTC AAGTCAATCA GGCGATATCG GCGGTACGAT TTCTGGTGGC ACAGTAGAGG 4080
TTAAAGCAAC CGAAAGTTTA ACCACTCAAT CCAATTCAAA AATTAAAGGA ACAACAGGCG 4140
AGGCTAACGT AACAAGTGCA ACAGGTACAA TTGGTGGTAC GATTTCCGGT AATACGGTAA 4200
ATGTTACGGC AAACGCTGGC GATTTAACAG TTGGGAATGG CGCAGAAATT AATGCGACAG 4260
AAGGAGCTGC AACCTTAACT ACATCATCGG GCAAATTAAC TACCGAAGCT AGTTCACACA 4320
TTACTTCAGC CAAGGGTCAG GTAAATCTTT CAGCTCAGGA TGGTAGCGTT GCAGGAAGTA 4380
TTAATGCCGC CAATGTGACA CTAAATACTA CAGGCACTTT AACTACCGTG AAGGGTTCAA 4440
ACATTAATGC AACCAGCGGT ACCTTGGTTA TTAACGCAAA AGACGCTGAG CTAAATGGCG 4500
CAGCATTGGG TAACCACACA GTGGTAAATG CAACCAACGC AAATGGCTCC GGCAGCGTAA 4560
TCGCGACAAC CTCAAGCAGA GTGAACATCA CTGGGGATTT AATCACAATA AATGGATTAA 4620
ATATCATTTC AAAAAACGGT ATAAACACCG TACTGTTAAA AGGCGTTAAA ATTGATGTGA 4680
AATACATTCA ACCGGGTATA GCAAGCGTAG ATGAAGTAAT TGAAGCGAAA CGCATCCTTG 4740
AGAAGGTAAA AGATTTATCT GATGAAGAAA GAGAAGCGTT AGCTAAACTT GGAGTAAGTG 4800
CTGTACGTTT TATTGAGCCA AATAATACAA TTACAGTCGA TACACAAAAT GAATTTGCAA 4860
CCAGACCATT AAGTCGAATA GTGATTTCTG AAGGCAGGGC GTGTTTCTCA AACAGTGATG 4920
GCGCGACGGT GTGCGTTAAT ATCGCTGATA ACGGGCGGTA GCGGTCAGTA ATTGACAAGG 4980
TAGATTTCAT CCTGCAATGA AGTCATTTTA TTTTCGTATT ATTTACTGTG TGGGTTAAAG 5040
TTCAGTACGG GCTTTACCCA TCTTGTAAAA AATTACGGAG AATACAATAA AGTATTTTTA 5100
ACAGGTTATT ATTATG 5116



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1536 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415

Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Thr Ala Gly Arg Ser Asn Thr Ser Glu Asp Asp Glu Tyr Thr
450 455 460

Gly Ser Gly Asn Ser Ala Ser Thr Pro Lys Arg Asn Lys Glu Lys Thr
465 470 475 480

Thr Leu Thr Asn Thr Thr Leu Glu Ser Ile Leu Lys Lys Gly Thr Phe
485 490 495

Val Asn Ile Thr Ala Asn Gln Arg Ile Tyr Val Asn Ser Ser Ile Asn
500 505 510

Leu Ser Asn Gly Ser Leu Thr Leu Trp Ser Glu Gly Arg Ser Gly Gly
515 520 525

Gly Val Glu Ile Asn Asn Asp Ile Thr Thr Gly Asp Asp Thr Arg Gly
530 535 540

Ala Asn Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn
545 550 555 560

Ile Ser Leu Gly Ala Gln Gly Asn Ile Asn Ile Thr Ala Lys Gln Asp
565 570 575

Ile Ala Phe Glu Lys Gly Ser Asn Gln Val Ile Thr Gly Gln Gly Thr
580 585 590

Ile Thr Ser Gly Asn Gln Lys Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Gly Thr Gly Ser Gly Leu Gln Phe Thr Thr Lys Arg Thr Asn Lys
610 615 620

Tyr Ala Ile Thr Asn Lys Phe Glu Gly Thr Leu Asn Ile Ser Gly Lys
625 630 635 640

Val Asn Ile Ser Met Val Leu Pro Lys Asn Glu Ser Gly Tyr Asp Lys
645 650 655

Phe Lys Gly Arg Thr Tyr Trp Asn Leu Thr Ser Leu Asn Val Ser Glu
660 665 670

Ser Gly Glu Phe Asn Leu Thr Ile Asp Ser Arg Gly Ser Asp Ser Ala
675 680 685

Gly Thr Leu Thr Gln Pro Tyr Asn Leu Asn Gly Ile Ser Phe Asn Lys
690 695 700

Asp Thr Thr Phe Asn Val Glu Arg Asn Ala Arg Val Asn Phe Asp Ile
705 710 715 720

Lys Ala Pro Ile Gly Ile Asn Lys Tyr Ser Ser Leu Asn Tyr Ala Ser
725 730 735

Phe Asn Gly Asn Ile Ser Val Ser Gly Gly Gly Ser Val Asp Phe Thr
740 745 750

Leu Leu Ala Ser Ser Ser Asn Val Gln Thr Pro Gly Val Val Ile Asn
755 760 765

Ser Lys Tyr Phe Asn Val Ser Thr Gly Ser Ser Leu Arg Phe Lys Thr
770 775 780

Ser Gly Ser Thr Lys Thr Gly Phe Ser Ile Glu Lys Asp Leu Thr Leu
785 790 795 800

Asn Ala Thr Gly Gly Asn Ile Thr Leu Leu Gln Val Glu Gly Thr Asp
805 810 815

Gly Met Ile Gly Lys Gly Ile Val Ala Lys Lys Asn Ile Thr Phe Glu
820 825 830

Gly Gly Asn Ile Thr Phe Gly Ser Arg Lys Ala Val Thr Glu Ile Glu
835 840 845

Gly Asn Val Thr Ile Asn Asn Asn Ala Asn Val Thr Leu Ile Gly Ser
850 855 860

Asp Phe Asp Asn His Gln Lys Pro Leu Thr Ile Lys Lys Asp Val Ile
865 870 875 880

Ile Asn Ser Gly Asn Leu Thr Ala Gly Gly Asn Ile Val Asn Ile Ala
885 890 895

Gly Asn Leu Thr Val Glu Ser Asn Ala Asn Phe Lys Ala Ile Thr Asn
900 905 910

Phe Thr Phe Asn Val Gly Gly Leu Phe Asp Asn Lys Gly Asn Ser Asn
915 920 925

Ile Ser Ile Ala Lys Gly Gly Ala Arg Phe Lys Asp Ile Asp Asn Ser
930 935 940

Lys Asn Leu Ser Ile Thr Thr Asn Ser Ser Ser Thr Tyr Arg Thr Ile
945 950 955 960

Ile Ser Gly Asn Ile Thr Asn Lys Asn Gly Asp Leu Asn Ile Thr Asn
965 970 975

Glu Gly Ser Asp Thr Glu Met Gln Ile Gly Gly Asp Val Ser Gln Lys
980 985 990

Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr Lys Gln
995 1000 1005

Ile Thr Ile Lys Ala Gly Val Asp Gly Glu Asn Ser Asp Ser Asp Ala
1010 1015 1020

Thr Asn Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys Leu Thr
1025 1030 1035 1040

Gln Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr Ala Lys
1045 1050 1055

Asp Gly Ser Asp Leu Thr Ile Gly Asn Thr Asn Ser Ala Asp Gly Thr
1060 1065 1070

Asn Ala Lys Lys Val Thr Phe Asn Gln Val Lys Asp Ser Lys Ile Ser
1075 1080 1085

Ala Asp Gly His Lys Val Thr Leu His Ser Lys Val Glu Thr Ser Gly
1090 1095 1100

Ser Asn Asn Asn Thr Glu Asp Ser Ser Asp Asn Asn Ala Gly Leu Thr
1105 1110 1115 1120

Ile Asp Ala Lys Asn Val Thr Val Asn Asn Asn Ile Thr Ser His Lys
1125 1130 1135

Ala Val Ser Ile Ser Ala Thr Ser Gly Glu Ile Thr Thr Lys Thr Gly
1140 1145 1150

Thr Thr Ile Asn Ala Thr Thr Gly Asn Val Glu Ile Thr Ala Gln Thr
1155 1160 1165

Gly Ser Ile Leu Gly Gly Ile Glu Ser Ser Ser Gly Ser Val Thr Leu
1170 1175 1180

Thr Ala Thr Glu Gly Ala Leu Ala Val Ser Asn Ile Ser Gly Asn Thr
1185 1190 1195 1200

Val Thr Val Thr Ala Asn Ser Gly Ala Leu Thr Thr Leu Ala Gly Ser
1205 1210 1215

Thr Ile Lys Gly Thr Glu Ser Val Thr Thr Ser Ser Gln Ser Gly Asp
1220 1225 1230

Ile Gly Gly Thr Ile Ser Gly Gly Thr Val Glu Val Lys Ala Thr Glu
1235 1240 1245

Ser Leu Thr Thr Gln Ser Asn Ser Lys Ile Lys Ala Thr Thr Gly Glu
1250 1255 1260

Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly Thr Ile Ser Gly
1265 1270 1275 1280

Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu Thr Val Gly Asn
1285 1290 1295

Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr Leu Thr Thr Ser
1300 1305 1310

Ser Gly Lys Leu Thr Thr Glu Ala Ser Ser His Ile Thr Ser Ala Lys
1315 1320 1325

Gly Gln Val Asn Leu Ser Ala Gln Asp Gly Ser Val Ala Gly Ser Ile
1330 1335 1340

Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Val
1345 1350 1355 1360

Lys Gly Ser Asn Ile Asn Ala Thr Ser Gly Thr Leu Val Ile Asn Ala
1365 1370 1375

Lys Asp Ala Glu Leu Asn Gly Ala Ala Leu Gly Asn His Thr Val Val
1380 1385 1390

Asn Ala Thr Asn Ala Asn Gly Ser Gly Ser Val Ile Ala Thr Thr Ser
1395 1400 1405

Ser Arg Val Asn Ile Thr Gly Asp Leu Ile Thr Ile Asn Gly Leu Asn
1410 1415 1420

Ile Ile Ser Lys Asn Gly Ile Asn Thr Val Leu Leu Lys Gly Val Lys
1425 1430 1435 1440

Ile Asp Val Lys Tyr Ile Gln Pro Gly Ile Ala Ser Val Asp Glu Val
1445 1450 1455

Ile Glu Ala Lys Arg Ile Leu Glu Lys Val Lys Asp Leu Ser Asp Glu
1460 1465 1470

Glu Arg Glu Ala Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Ile
1475 1480 1485

Glu Pro Asn Asn Thr Ile Thr Val Asp Thr Gln Asn Glu Phe Ala Thr
1490 1495 1500

Arg Pro Leu Ser Arg Ile Val Ile Ser Glu Gly Arg Ala Cys Phe Ser
1505 1510 1515 1520

Asn Ser Asp Gly Ala Thr Val Cys Val Asn Ile Ala Asp Asn Gly Arg
1525 1530 1535

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4937 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 



TAAATATACA AGATAATAAA AATAAATCAA GATTTTTGTG ATGACAAACA ACAATTACAA 60
CACCTTTTTT [missing] TGCAAATATT TTAAAAAAAT AGTATAAATC CGCCATATAA 120
AATGGTATAA TCTTTCATCT TTCATCTTTA ATCTTTCATC TTTCATCTTT CATCTTTCAT 180
CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC TTTCATCTTT 240
CACATGAAAT GATGAACCGA GGGAAGGGAG GGAGGGGCAA GAATGAAGAG GGAGCTGAAC 300
GAACGCAAAT GATAAAGTAA TTTAATTGTT CAACTAACCT TAGGAGAAAA TATGAACAAG 360
ATATATCGTC TCAAATTCAG CAAACGCCTG AATGCTTTGG TTGCTGTGTC TGAATTGGCA 420
CGGGGTTGTG ACCATTCCAC AGAAAAAGGC TTCCGCTATG TTACTATCTT TAGGTGTAAC 480
CACTTAGCGT TAAAGCCACT TTCCGCTATG TTACTATCTT TAGGTGTAAC ATCTATTCCA 540
CAATCTGTTT TAGCAAGCGG CTTACAAGGA ATGGATGTAG TACACGGCAC AGCCACTATG 600
CAAGTAGATG GTAATAAAAC CATTATCCGC AACAGTGTTG ACGCTATCAT TAATTGGAAA 660
CAATTTAACA TCGACCAAAA TGAAATGGTG CAGTTTTTAC AAGAAAACAA CAACTCCGCC 720
GTATTCAACC GTGTTACATC TAACCAAATC TCCCAATTAA AAGGGATTTT AGATTCTAAC 780
GGACAAGTCT TTTTAATCAA CCCAAATGGT ATCACAATAG GTAAAGACGC AATTATTAAC 840
ACTAATGGCT TTACGGCTTC TACGCTAGAC ATTTCTAACG AAAACATCAA GGCGCGTAAT 900
[missing] AGCAAACCAA AGATAAAGCG CTCGCTGAAA TTGTGAATCA CGGTTTAATT 960
ACTGTCGGTA AAGACGGCAG TGTAAATCTT ATTGGTGGCA AAGTGAAAAA CGAGGGTGTG 1020
ATTAGCGTAA ATGGTGGCAG CATTTCTTTA CTCGCAGGGC AAAAAATCAC CATCAGCGAT 1080
ATAATAAACC CAACCATTAC TTACAGCATT GCCGCGCCTG AAAATGAAGC GGTCAATCTG 1140
GGCGATATTT TTGCCAAAGG CGGTAACATT AATGTCCGTG CTGCCACTAT TCGAAACCAA 1200
GGTAAACTTT CTGCTGATTC TGTAAGCAAA GATAAAAGCG GCAATATTGT TCTTTCCGCC 1260
AAAGAGGGTG AAGCGGAAAT TGGCGGTGTA ATTTCCGCTC AAAATCAGCA AGCTAAAGGC 1320
GGCAAGCTGA TGATTACAGG CGATAAAGTC ACATTAAAAA CAGGTGCAGT TATCGACCTT 1380
TCAGGTAAAG AAGGGGGAGA AACTTACCTT GGCGGTGACG AGCGCGGCGA AGGTAAAAAC 1440
GGCATTCAAT TAGCAAAGAA AACCTCTTTA GAAAAAGGCT CAACCATCAA TGTATCAGGC 1500
AAAGAAAAAG GCGGACGCGC TATTGTGTGG GGCGATATTG CGTTAATTGA CGGCAATATT 1560
AACGCTCAAG GTAGTGGTGA TATCGCTAAA ACCGGTGGTT TTGTGGAGAC ATCGGGGCAT 1620
TATTTATCCA TTGACAGCAA TGCAATTGTT AAAACAAAAG AGTGGTTGCT AGACCCTGAT 1680
GATGTAACAA TTGAAGCCGA AGACCCCCTT CGCAATAATA CCGGTATAAA TGATGAATTC 1740
CCAACAGGCA CCGGTGAAGC AAGCGACCCT AAAAAAAATA GCGAACTCAA AACAACGCTA 1800
ACCAATACAA CTATTTCAAA TTATCTGAAA AACGCCTGGA CAATGAATAT AACGGCATCA 1860
AGAAAACTTA CCGTTAATAG CTCAATCAAC ATCGGAAGCA ACTCCCACTT AATTCTCCAT 1920
AGTAAAGGTC AGCGTGGCGG AGGCGTTCAG ATTGATGGAG ATATTACTTC TAAAGGCGGA 1980
AATTTAACCA TTTATTCTGG CGGATGGGTT GATGTTCATA AAAATATTAC GCTTGATCAG 2040
GGTTTTTTAA ATATTACCGC CGCTTCCGTA GCTTTTGAAG GTGGAAATAA CAAAGCACGC 2100
GACGCGGCAA ATGCTAAAAT TGTCGCCCAG GGCACTGTAA CCATTACAGG AGAGGGAAAA 2160
GATTTCAGGG CTAACAACGT ATCTTTAAAC GGAACGGGTA AAGGTCTGAA TATCATTTCA 2220
TCAGTGAATA ATTTAACCCA CAATCTTAGT GGCACAATTA ACATATCTGG GAATATAACA 2280
ATTAACCAAA CTACGAGAAA GAACACCTCG TATTGGCAAA CCAGCCATGA TTCGCACTGG 2340
AACGTCAGTG CTCTTAATCT AGAGACAGGC GCAAATTTTA CCTTTATTAA ATACATTTCA 2400
AGCAATAGCA AAGGCTTAAC AACACAGTAT AGAAGCTCTG CAGGGGTGAA TTTTAACGGC 2460
GTAAATGGCA ACATGTCATT CAATCTCAAA GAAGGAGCGA AAGTTAATTT CAAATTAAAA 2520
CCAAACGAGA ACATGAACAC AAGCAAACCT TTACCAATTC GGTTTTTAGC CAATATCACA 2580
GCCACTGGTG GGGGCTCTGT TTTTTTTGAT ATATATGCCA ACCATTCTGG CAGAGGGGCT 2640
GAGTTAAAAA TGAGTGAAAT TAATATCTCT AACGGCGCTA ATTTTACCTT AAATTCCCAT 2700
GTTCGCGGCG ATGACGCTTT TAAAATCAAC AAAGACTTAA CCATAAATGC AACCAATTCA 2760
AATTTCAGCC TCAGACAGAC GAAAGATGAT TTTTATGACG GGTACGCACG CAATGCCATC 2820
AATTCAACCT ACAACATATC CATTCTGGGC GGTAATGTCA CCCTTGGTGG ACAAAACTCA 2880
AGCAGCAGCA TTACGGGGAA TATTACTATC GAGAAAGCAG CAAATGTTAC GCTAGAAGCC 2940
AATAACGCCC CTAATCAGCA AAACATAAGG GATAGAGTTA TAAAACTTGG CAGCTTGCTC 3000
GTTAATGGGA GTTTAAGTTT AACTGGCGAA AATGCAGATA TTAAAGGCAA TCTCACTATT 3060
TCAGAAAGCG CCACTTTTAA AGGAAAGACT AGAGATACCC TAAATATCAC CGGCAATTTT 3120
ACCAATAATG GCACTGCCGA AATTAATATA ACACAAGGAG TGGTAAAACT TGGCAATGTT 3180
ACGAATGATG GTGATTTAAA CATTACCACT CACGCTAAAC GCAACCAAAG AAGCATCATC 3240
GGCGGAGATA TAATCAACAA AAAAGGAAGC TTAAATATTA CAGACAGTAA TAATGATGCT 3300
GAAATCCAAA TTGGCGGCAA TATCTCGCAA AAAGAAGGCA ACCTCACGAT TTCTTCCGAT 3360
AAAATTAATA TCACCAAACA GATAACAATC AAAAAGGGTA TTGATGGAGA GGACTCTAGT 3420
TCAGATGCGA CAAGTAATGC CAACCTAACT ATTAAAACCA AAGAATTGAA ATTGACAGAA 3480
GACCTAAGTA TTTCAGGTTT CAATAAAGCA GAGATTACAG CCAAAGATGG TAGAGATTTA 3540
ACTATTGGCA ACAGTAATGA CGGTAACAGC GGTGCCGAAG CCAAAACAGT AACTTTTAAC 3600
AATGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAATG TGACACTAAA TAGCAAAGTG 3660
AAAACATCTA GCAGCAATGG CGGACGTGAA AGCAATAGCG ACAACGATAC CGGCTTAACT 3720



^rar.o. aotaaac^ c.tatt.ctt ctctc^c .ct;uat.xc 3780
ACCOCCTCCC A^co^.c CCCTCCACCA ^^COC^^c AAATOCCAAA 3840
CCA.CT.^. C^cC^c AOOTC.TATC .CCCCTACOA TTTCCCOTAA C.CCOTAAOT 3900
CTT.OCCCO. CTOCTOATTT AACCACTA^ TCCCCCTCAA AAA^OAACC CAAATCOOCT 3960
c.occt;^tc taacaactoc aacaoctaca ArrccccGTA CAATTTCCGO t;^tacogta 4020
AATCTTACOO CAAACCCTCC CCAT.~rAACA CXTCCCAATC OCOCACAAAT TAATCCCACA 4080
GAAGGAGCTG CAACCTTAAC CGCAACAGGG AATACCTTGA CTACTGAAGC CGGTTCTAGC 4140
ATCACTTCAA CTAAGGGTCA GGTAGACCTC TTGGCTCAGA ATGGTAGCAT CGCAGGAAGC 4200
ATTAATGCTG CTAATGTGAC ATTAAATACT ACAGGCACCT TAACCACCGT GGCAGGCTCG 4260
GATATTAAAG CAACCAGCCG CACCTTGGTT ATTAACGCAA AAGATGCTAA GCTAAATGGT 4320
GATGCATCAG GTGATAGTAC AGAAGTOAAT GCAGTCAACG CAAGCGGCTC TGGTAGTGTG 4380
AC^CGGCAA CCTCAAGCAG ^^AATATC ACT^GA^ TAAACACAGT AAATGGGTTA 4440
AATATCA^ COAAAGATGG TAGAAACACT GTGCGC™ GAGGCAAGGA AATTGAGGTO 4500
AAATATATCC AGCCAGGTGT AGCAAGTGTA GAAGAAGTAA TTGAAGCGAA ACGCGTCCTT 4560
GAAAAAGTAA AAGATTTATC TGATGAAGAA AGAGAAACAT TACCTAAACT TGGTGTAAGT 4620
GCTGTACGTT ^G^GAGCC AAATAATACA A^ACAGTCA ATACACAAAA TGAA^CA 4680
ACCAGACCGT CAAGTCAAGT GATAATTTCT GAAGGTAAGG CGTCTTTCTC AAGTGGTAAT 4740
GGCGCACGAG TATGTACCAA TGTTGCTGAC GATGGACAGC CGTAGTCAGT AATTGACAAG 4800
GTAGArrXCA TCCTX.CAATG AAGTCArrXT ATTTTCGTAT TAt^ACT^T GTX.GG^;^ 4860
GTTCAGTACG GGCT^ACCC ATCOTGTAAA AAATTACGGA GAATACAATA AAGTA.^ 4920
AACAGGTTAT TATTATG 4937

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1477 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Ala Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Ala Arg Met Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Met Leu Leu Ser Leu Gly Val Thr Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Asp Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Ile Ile Arg Asn Ser Val
85 90 95

Asp Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Val Gln Phe Leu Gln Glu Asn Asn Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asn Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Phe Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Val Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Gln Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Ser Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Phe Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Ser Gly Asp Ile Ala Lys Thr Gly Gly
405 410 415

Phe Val Glu Thr Ser Gly His Asp Leu Phe Ile Lys Asp Asn Ala Ile
420 425 430

Val Asp Ala Lys Glu Trp Leu Leu Asp Phe Asp Asn Val Ser Ile Asn
435 440 445

Ala Glu Asp Pro Leu Phe Asn Asn Thr Gly Ile Asn Asp Glu Phe Pro
450 455 460

Thr Gly Thr Gly Glu Ala Ser Asp Pro Lys Lys Asn Ser Glu Leu Lys
465 470 475 480

Thr Thr Leu Thr Asn Thr Thr Ile Ser Asn Tyr Leu Lys Asn Ala Trp
485 490 495

Thr Met Asn Ile Thr Ala Ser Arg Lys Leu Thr Val Asn Ser Ser Ile
500 505 510

Asn Ile Gly Ser Asn Ser His Leu Ile Leu His Ser Lys Gly Gln Arg
515 520 525

Gly Gly Gly Val Gln Ile Asp Gly Asp Ile Thr Ser Lys Gly Gly Asn
530 535 540

Leu Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr
545 550 555 560

Leu Asp Gln Gly Phe Leu Asn Ile Thr Ala Ala Ser Val Ala Phe Glu
565 570 575

Gly Gly Asn Asn Lys Ala Arg Asp Ala Ala Asn Ala Lys Ile Val Ala
580 585 590

Gln Gly Thr Val Thr Ile Thr Gly Glu Gly Lys Asp Phe Arg Ala Asn
595 600 605

Asn Val Ser Leu Asn Gly Thr Gly Lys Gly Leu Asn Ile Ile Ser Ser
610 615 620

Val Asn Asn Leu Thr His Asn Leu Ser Gly Thr Ile Asn Ile Ser Gly
625 630 635 640

Asn Ile Thr Ile Asn Gln Thr Thr Arg Lys Asn Thr Ser Tyr Trp Gln
645 650 655

Thr Ser His Asp Ser His Trp Asn Val Ser Ala Leu Asn Leu Glu Thr
660 665 670

Gly Ala Asn Phe Thr Phe Ile Lys Tyr Ile Ser Ser Asn Ser Lys Gly
675 680 685

Leu Thr Thr Gln Tyr Arg Ser Ser Ala Gly Val Asn Phe Asn Gly Val
690 695 700

Asn Gly Asn Met Ser Phe Asn Leu Lys Glu Gly Ala Lys Val Asn Phe
705 710 715 720

Lys Leu Lys Pro Asn Glu Asn Met Asn Thr Ser Lys Pro Leu Pro Ile
725 730 735

Arg Phe Leu Ala Asn Ile Thr Ala Thr Gly Gly Gly Ser Val Phe Phe
740 745 750

Asp Ile Tyr Ala Asn His Ser Gly Arg Gly Ala Glu Leu Lys Met Ser
755 760 765

Glu Ile Asn Ile Ser Asn Gly Ala Asn Phe Thr Leu Asn Ser His Val
770 775 780

Arg Gly Asp Asp Ala Phe Lys Ile Asn Lys Asp Leu Thr Ile Asn Ala
785 790 795 800

Thr Asn Ser Asn Phe Ser Leu Arg Gln Thr Lys Asp Asp Phe Tyr Asp
805 810 815

Gly Tyr Ala Arg Asn Ala Ile Asn Ser Thr Tyr Asn Ile Ser Ile Leu
820 825 830

Gly Gly Asn Val Thr Leu Gly Gly Gln Asn Ser Ser Ser Ser Ile Thr
835 840 845

Gly Asn Ile Thr Ile Glu Lys Ala Ala Asn Val Thr Leu Glu Ala Asn
850 855 860

Asn Ala Pro Asn Gln Gln Asn Ile Arg Asp Arg Val Ile Lys Leu Gly
865 870 875 880

Ser Leu Leu Val Asn Gly Ser Leu Ser Leu Thr Gly Glu Asn Ala Asp
885 890 895

Ile Lys Gly Asn Leu Thr Ile Ser Glu Ser Ala Thr Phe Lys Gly Lys
900 905 910

Thr Arg Asp Thr Leu Asn Ile Thr Gly Asn Phe Thr Asn Asn Gly Thr
915 920 925

Ala Glu Ile Asn Ile Thr Gln Gly Val Val Lys Leu Gly Asn Val Thr
930 935 940

Asn Asp Gly Asp Leu Asn Ile Thr Thr His Ala Lys Arg Asn Gln Arg
945 950 955 960

Ser Ile Ile Gly Gly Asp Ile Ile Asn Lys Lys Gly Ser Leu Asn Ile
965 970 975

Thr Asp Ser Asn Asn Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser
980 985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Ile Asn Ile Thr
995 1000 1005

Lys Gln Ile Thr Ile Lys Lys Gly Ile Asp Gly Glu Asp Ser Ser Ser
1010 1015 1020

Asp Ala Thr Ser Asn Ala Asn Leu Thr Ile Lys Thr Lys Glu Leu Lys
1025 1030 1035 1040

Leu Thr Glu Asp Leu Ser Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr
1045 1050 1055

Ala Lys Asp Gly Arg Asp Leu Thr Ile Gly Asn Ser Asn Asp Gly Asn
1060 1065 1070

Ser Gly Ala Glu Ala Lys Thr Val Thr Phe Asn Asn Val Lys Asp Ser
1075 1080 1085

Lys Ile Ser Ala Asp Gly His Asn Val Thr Leu Asn Ser Lys Val Lys
1090 1095 1100

Thr Ser Ser Ser Asn Gly Gly Arg Glu Ser Asn Ser Asp Asn Asp Thr
1105 1110 1115 1120

Gly Leu Thr Ile Thr Ala Lys Asn Val Glu Val Asn Lys Asp Ile Thr
1125 1130 1135

Ser Leu Lys Thr Val Asn Ile Thr Ala Ser Glu Lys Val Thr Thr Thr
1140 1145 1150

Ala Gly Ser Thr Ile Asn Ala Thr Asn Gly Lys Ala Ser Ile Thr Thr
1155 1160 1165

Lys Thr Gly Asp Ile Ser Gly Thr Ile Ser Gly Asn Thr Val Ser Val
1170 1175 1180

Ser Ala Thr Val Asp Leu Thr Thr Lys Ser Gly Ser Lys Ile Glu Ala
1185 1190 1195 1200

Lys Ser Gly Glu Ala Asn Val Thr Ser Ala Thr Gly Thr Ile Gly Gly
1205 1210 1215

Thr Ile Ser Gly Asn Thr Val Asn Val Thr Ala Asn Ala Gly Asp Leu
1220 1225 1230

Thr Val Gly Asn Gly Ala Glu Ile Asn Ala Thr Glu Gly Ala Ala Thr
1235 1240 1245

Leu Thr Ala Thr Gly Asn Thr Leu Thr Thr Glu Ala Gly Ser Ser Ile
1250 1255 1260

Thr Ser Thr Lys Gly Gln Val Asp Leu Leu Ala Gln Asn Gly Ser Ile
1265 1270 1275 1280

Ala Gly Ser Ile Asn Ala Ala Asn Val Thr Leu Asn Thr Thr Gly Thr
1285 1290 1295

Leu Thr Thr Val Ala Gly Ser Asp Ile Lys Ala Thr Ser Gly Thr Leu
1300 1305 1310

Val Ile Asn Ala Lys Asp Ala Lys Leu Asn Gly Asp Ala Ser Gly Asp
1315 1320 1325

Ser Thr Glu Val Asn Ala Val Asn Ala Ser Gly Ser Gly Ser Val Thr
1330 1335 1340

Ala Ala Thr Ser Ser Ser Val Asn Ile Thr Gly Asp Leu Asn Thr Val
1345 1350 1355 1360

Asn Gly Leu Asn Ile Ile Ser Lys Asp Gly Arg Asn Thr Val Arg Leu
1365 1370 1375

Arg Gly Lys Glu Ile Glu Val Lys Tyr Ile Gln Pro Gly Val Ala Ser
1380 1385 1390

Val Glu Glu Val Ile Glu Ala Lys Arg Val Leu Glu Lys Val Lys Asp
1395 1400 1405

Leu Ser Asp Glu Glu Arg Glu Thr Leu Ala Lys Leu Gly Val Ser Ala
1410 1415 1420

Val Arg Phe Val Glu Pro Asn Asn Thr Ile Thr Val Asn Thr Gln Asn
1425 1430 1435 1440

Glu Phe Thr Thr Arg Pro Ser Ser Gln Val Ile Ile Ser Glu Gly Lys
1445 1450 1455

Ala Cys Phe Ser Ser Gly Asn Gly Ala Arg Val Cys Thr Asn Val Ala
1460 1465 1470

Asp Asp Gly Gln Pro
1475

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 



ACAGCGTTCT CTTAATACTA GTACAAACCC ACAATAAAAT ATGACAAACA ACAATTACAA 60
CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAATA GTATAAATCC GCCATATAAA 120
ATGGTATAAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC ATCTTTCATC 180
TTTCATCTTT CATCTTTCAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC 240
ACATGAAATG ATGAACCGAG GGAAGGGAGG GAGGGGCAAG AATGAAGAGG GAGCTGAACG 300
AACGCAAATG ATAAAGTAAT TTAATTGTTC AACTAACCTT AGGAGAAAAT ATGAACAAGA 360
TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT GAATTGGCAC 420
GGGGTTGTGA CCATTCCACA GAAAAAGGCA GCGAAAAACC TGCTCGCATG AAAGTGCGTC 480
ACTTAGCGTT AAAGCCACTT TCCGCTATGT TACTATCTTT AGGTGTAACA TCTATTCCAC 540
AATCTGTTTT AGCAAGCGGC TTACAAGGAA TGGATGTAGT ACACGGCACA GCCACTATGC 600
AAGTAGATGG TAATAAAACC ATTATCCGCA ACAGTGTTGA CGCTATCATT AATTGGAAAC 660
AATTTAACAT CGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAACAAC AACTCCGCCG 720
TATTCAACCG TGTTACATCT AACCAAATCT CCCAATTAAA AGGGATTTTA GATTCTAACG 780
GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA ATTATTAACA 840
CTAATGGCTT TACGGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG GCGCGTAATT 900
TCACCTTCGA GCAAACCAAA GATAAAGCGC TCGCTGAAAT TGTGAATCAC GGTTTAATTA 960
CTGTCGGTAA AGACGGCAGT GTAAATCTTA TTGGTGGCAA AGTGAAAAAC GAGGGTGTGA 1020
TTAGCGTAAA TGGTGGCAGC ATTTCTTTAC TCGCAGGGCA AAAAATCACC ATCAGCGATA 1080
TAATAAACCC AACCATTACT TACAGCATTG CCGCGCCTGA AAATGAAGCG GTCAATCTGG 1140
GCGATATTTT TGCCAAAGGC GGTAACATTA ATGTCCX3TGC TGCCACTATT CGAAACCAAG 1200
CTTTCCGCCA AAGAGGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 1260
GCTAAAGGCG GCAAGCTGAT GATTACAGGC GATAAAGTCA CATTAAAAAC AGGTGCAGTT 1320
ATCGACCTTT CAGGTAAAGA AGGGGGAGAA ACTTACCTTG GCGGTGACGA GCGCGGCGAA 1380
GGTAAAAACG GCATTCAATT AGCAAAGAAA ACCTCTTTAG AAAAAGGCTC AACCATCAAT 1440
GTATCAGGCA AAGAAAAAGG CGGACGCGCT ATTGTGTGGG GCGATATTGC GTTAATTGAC 1500
GGCAATATTA ACGCTCAAGG TAGTGGTGAT ATCGCTAAAA CCGGTGGTTT TGTGGAGACG 1560
TCGGGGCATG ATTTATTCAT CAAAGACAAT GCAATTGTTG ACGCCAAAGA GTGGTTGTTA 1620
GACCCGGATA ATGTATCTAT TAATGCAGAA ACAGCAGGAC GCAGCAATAC TTCAGAAGAC 1680
GATGAATACA CGGGATCCGG GAATAGTGCC AGCACCCCAA AACGAAACAA AGAAAAGACA 1740
ACATTAACAA ACACAACTCT TGAGAGTATA CTAAAAAAAG GTACCTTTGT TAACATCACT 1800
GCTAATCAAC GCATCTATGT CAATAGCTCC ATTAATTTAT CCAATGGCAG CTTAACTCTT 1860
TGGAGTGAGG GTCGGAGCGG TGGCGGCGTT GAGATTAACA ACGATATTAC CACCGGTGAT 1920
GATACCAGAG GTGCAAACTT AACAATTTAC TCAGGCGGCT GGGTTGATGT TCATAAAAAT 1980
ATCTCACTCG GGGCGCAAGG TAACATAAAC ATTACAGCTA AACAAGATAT CGCCTTTGAG 2040
AAAGGAAGCA ACCAAGTCAT TACAGGTCAA GGGACTATTA CCTCAGGCAA TCAAAAAGGT 2100
TTTAGATTTA ATAATGTCTC TCTAAACGGC ACTGGCAGCG GACTGCAATT CACCACTAAA 2160
AGAACCAATA AATACGCTAT CACAAATAAA TTTGAAGGGA CTTTAAATAT TTCAGGGAAA 2220
GTGAACATCT CAATGGTTTT ACCTAAAAAT GAAAGTGGAT ATGATAAATT CAAAGGACGC 2280
ACTTACTGGA ATTTAACCTC GAAAGTGGAT ATGATAAATT CAAAGGACGC CCTCACTATT 2340
GACTCCAGAG GAAGCGATAG TGCAGGCACA CTTACCCAGC CTTATAATTT AAACGGTATA 2400
TCATTCAACA AAGACACTAC CTTTAATGTT GAACGAAATG C^AGAGTCAA CTTTGACATC 2460
AAGGCACCAA TAGGGATAAA TAAGTATTCT AGTTTGAATT ACGCATCATT TAATGGAAAC 2520
ATTTCAGTTT CGGGAGGGGG GAGTGTTGAT TTCACACTTC TCGCCTCATC CTCTAACGTC 2580
CAAACCCCCG GTGTAGTTAT AAATTCTAAA TACTTTTVATG TTTCAACAGG GTCAAGTTTA 2640
AGATTTAAAA CTTCAGGCTC AACAAAAACT GGCTTCTCAA TAGAGAAAGA TTTAACTTTA 2700
AATGCCACCG GAGGCAACAT AACACTTTTG CAAGTTGAAG GCACCGATGG AATGATTGGT 2760
AAAGGCATTG TAGCCAAAAA AAACATAACC TTTGAAGGAG GTAAGATGAG GTTTGGCTCC 2820
AGGAAAGCCG TAACAGAAAT CGAAGGCAAT GTTACTATCA ATAACAACGC TAACGTCACT 2880
CTTATCGGTT CGGATTTTGA CAACCATCAA AAACCTTTAA CTATTAAAAA AGATGTCATC 2940
AJTAATAGCG GCAACCTTAC CGCTGGAGGC AATATTGTCA ATATAGCCGG AAATCTTACC 3000
GITGAAAGTA ACGCTAATTT CAAAGCTATC ACAAATTTCA CTTTTAATGT AGGCGGCTTG 3060
TTTGACAACA AAGGCAATTC AAATATTTCC ATTGCCAAAG GAGGGGCTCG CTTTAAAGAC 3120
ATTGATAATT CCAAGAATTT AAGCATCACC ACCAACTCCA GCTCCACTTA CCGCACTATT 3180
ATAAGCGGCA ATATAACCAA TAAAAACGGT GATTTAAATA TTACGAACGA AGGTAGTGAT 3240
ACTGAAATGC AAATTGGCGG CGATGTCTCG CAAAAAGAAG GTAATCTCAC GATTTCTTCT 3300
GACAAAATCA ATATTACCAA ACAGATAACA ATCAAGGCAG GTGTTGATGG GGAGAATTCC 3360
GATTCAGACG CGACAAACAA TGCCAATCTA ACCATTAAAA CCAAAGAATT GAAATTAACG 3420
CAAGACCTAA ATATTTCAGG TTTCAATAAA GCAGAGATTA CAGCTAAAGA TGGTAGTGAT 3480
TTAACTATTG GTAACACCAA TAGTGCTGAT GGTACTAATG CCAAAAAAGT AACCTTTAAC 3540
CAGGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAAGG TGACACTACA CAGCAAAGTG 3600

GAAACATCCG GTAGTAATAA CAACACTGAA GATAGCAGTG ACAATAATGC CGGCTTAACT 3660 

ATCGATGCAA AAAATGTAAC AGTAAACAAC AATATTACTT CTCACAAAGC AGTGAGCATC 3720

TCTGCGACAA GTGGAGAAAT TACCACTAAA ACAGGTACAA CCATTAACGC AACCACTGGT 3780

AACGTGGAGA TAACCGCTCA AACAGGTAGT ATCCTAGGTG GAATTGAGTC CAGCTCTGGC 3840

TCTGTAACAC TTACTGCAAC CGAGGGCGCT CTTGCTGTAA GCAATATTTC GGGCAACACC 3900

GTTACTGTTA CTGCAAATAG CGGTGCATTA ACCACTTTGG CAGGCTCTAC AATTAAAGGA 3960

ACCGAGAGTG TAACCACTTC AAGTCAATCA GGCGATATCG GCGGTACGAT TTCTGGTGGC 4020

ACAGTAGAGG TTAAAGCAAC CGAAAGTTTA ACCACTCAAT CCAATTCAAA AATTAAAGGA 4080

ACAACAGGCG AGGCTAACGT AACAAGTGCA ACAGGTACAA TTGGTGGTAC GATTTCCGGT 4140 

AATACGGTAA ATGTTACGGC AAACGCTGGC GATTTAACAG TTGGGAATGG CGCAGAAATT 4200 

AATGCGACAG AAGGAGCTGC AACCTTAACT ACATCATCGG GCAAATTAAC TACCGAAGCT 4260 

AGTTCACACA TTACTTCAGC CAAGGGTCAG GTAAATCTTT CAGCTCAGGA TGGTAGCGTT 4320 

GCAGGAAGTA TTAATGCCGC CAATGTGACA CTAAATACTA CAGGCACTTT AACTACCGTG 4380

AAGGGTTCAA ACATTAATGC AACCAGCGGT ACCTTGGTTA TTAACGCAAA AGACGCTGAG 4440

CTAAATGGCG CAGCATTGGG TAACCACACA GTGGTAAATG CAACCAACGC AAATGGCTCC 4500

GGCAGCGTAA TCGCGACAAC CTCAAGCAGA GTGAACATCA CTGGGGATTT AATCACAATA 4560

AATGGATTAA ATATCATTTC AAAAAACGGT ATAAACACCG TACTGTTAAA AGGCGTTAAA 4620

ATTGATGTGA AATACATTCA ACCGGGTATA GCAAGCGTAG ATGAAGTAAT TGAAGCGAAA 4680

CGCATCCTTG AGAAGGTAAA AGATTTATCT GATGAAGAAA GAGAAGCGTT AGCTAAACTT 4740

GGCGTAAGTG CTGTACGTTT TATTGAGCCA AATAATACAA TTACAGTCGA TACACAAAAT 4800

GAATTTGCAA CCAGACCATT AAGTCGAATA GTGATTTCTG AAGGCAGGGC GTGTTTCTCA 4860

AACAGTGATG GCGCGACGGT GTGCGTTAAT ATCGCTGATA ACGGGCGGTA GCGGTCAGTA 4920

ATTGACAAGG TAGATTTCAT CCTGCAATGA AGTCATTTTA TTTTCGTATT ATTTACTGTG 4980 

TGGGTTAAAG TTCAGTACGG GCTTTACCCA TCTTGTAAAA AATTACGGAG AATACAATAA 5040 

AGTATTTTTA ACAGGTTATT ATTATGAAAA ATATAAAAAG CAGATTAAAA CTCAGTGCAA 5100 

TATCAGTATT GCTTGGCCTG GCTTCTTCAT CATTGTATGC AGAAGAAGCG TTTTTAGTAA 5160 

AAGGCTTTCA GTTATCTGGT GCACTTGAAA CTTTAAGTGA AGACGCCCAA CTGTCTGTAG 5220 

CAAAATCTTT ATCTAAATAC CAAGGCTCGC AAACTTTAAC AAACCTAAAA ACAGCACAGC 5280 

TTGAATTACA GGCTGTGCTA GATAAGATTG AGCCAAATAA GTTTGATGTG ATATTGCCAC 5340

AACAAACCAT TACGGATGGC AATATTATGT TTGAGCTAGT CTCGAAATCA GCCGCAGAAA 5400 

GCCAAGTTTT TTATAAGGCG AGCCAGGGTT ATAGTGAAGA AAATATCGCT CGTAGCCTGC 5460 

CATCTTTGAA ACAAGGAAAA GTGTATGAAG ATGGTCGTCA GTGGTTCGAT TTGCGTGAAT 5520 

TCAATATGGC AAAAGAAAAT CCACTTAAAG TCACTCGCGT GCATTACGAG TTAAACCCTA 5580 
AAAACAAAAC CTCTGATTTG GTAGTTGCAG GTTTTTCGCC TTTTGGCAAA ACGCGTAGCT 5640

TTGTTTCCTA TGATAATTTC GGCGCAAGGG AGTTTAACTA TCAACGTGTA AGTCTAGGTT 5700

TTGTAAATGC CAATTTGACC GGACATGATG ATGTATTAAA TCTAAACGCA TTGACCAATG 5760

TAAAAGCACC ATCAAAATCT TATGCGGTAG GCATAGGATA TACTTATCCG TTTTATGATA 5820

AACACCAATC CTTAAGTCTT TATACCAGCA TGAGTTATGC TGATTCTAAT GATATCGACG 5880

GCTTACCAAG TGCGATTAAT CGTAAATTAT CAAAAGGTCA ATCTATCTCT GCGAATCTGA 5940

AATGGAGTTA TTATCTCCCG ACATTTAACC TTGGAATGGA AGACCAGTTT AAAATTAATT 6000 

TAGGCTACAA CTACCGCCAT ATTAATCAAA CATCCGAGTT AAACACCCTG GGTGCAACGA 6060

AGAAAAAATT TGCAGTATCA GGCGTAAGTG CAGGCATTGA TGGACATATC CAATTTACCC 6120 

CTAAAACAAT CTTTAATATT GATTTAACTC ATCATTATTA CGCGAGTAAA TTACCAGGCT 6180 

CTTTTGGAAT GGAGCGCATT GGCGAAACAT TTAATCGCAG CTATCACATT AGCACAGCCA 6240 

GTTTAGGGTT GAGTCAAGAG TTTGCTCAAG GTTGGCATTT TAGCAGTCAA TTATCGGGTC 6300

AGTTTACTCT ACAAGATATA AGTAGCATAG ATTTATTCTC TGTAACAGGT ACTTATGGCG 6360

TCAGAGGCTT TAAATACGGC GGTGCAAGTG GTGAGCGCGG TCTTGTATGG CGTAATGAAT 6420

TAAGTATGCC AAAATACACC CGCTTTCAAA TCAGCCCTTA TGCGTTTTAT GATGCAGGTC 6480

AGTTCCGTTA TAATAGCGAA AATGCTAAAA CTTACGGCGA AGATATGCAC ACGGTATCCT 6540

CTGCGGGTTT AGGCATTAAA ACCTCTCCTA CACAAAACTT AAGCTTAGAT GCTTTTGTTG 6600 

CTCGTCGCTT TGCAAATGCC AATAGTGACA ATTTGAATGG CAACAAAAAA CGCACAAGCT 6660

CACCTACAAC CTTCTGGGGT AGATTAACAT TCAGTTTCTA ACCCTGAAAT TTAATCAACT 6720 

GGTAAGCGTT CCGCCTACCA GTTTATAACT ATATGCTTTA CCCGCCAATT TACAGTCTAT 6780 

ACGCAACCCT GTTTTCATCC TTATATATCA AACAAACTAA GCAAACCAAG CAAACCAAGC 6840 

AAACCAAGCA AACCAAGCAA ACCAAGCAAA CCAAGCAAAC CAAGCAAACC AAGCAAACCA 6900

AGCAAACCAA GCAAACCAAG CAAACCAAGC AAACCAAGCA ATGCTAAAAA ACAATTTATA 6960

TGATAAACTA AAACATACTC CATACCATGG CAATACAAGG GATTTAATAA TATGACAAAA 7020 

GAAAATTTAC AAAGTGTTCC ACAAAATACG ACCGCTTCAC TTGTAGAATC AAACAACGAC 7080 

CAAACTTCCC TGCAAATACT TAAACAACCA CCCAAACCCA ACCTATTACG CCTGGAACAA 7140 

CATGTCGCCA AAAAAGATTA TGAGCTTGCT TGCCGCGAAT TAATGGCGAT TTTGGAAAAA 7200 

ATGGACGCTA ATTTTGGAGG CGTTCACGAT ATTGAATTTG ACGCACCTGC TCAGCTGGCA 7260 

TATCTACCCG AAAAACTACT AATTCATTTT GCCACTCGTC TCGCTAATGC AATTACAACA 7320 

CTCTTTTCCG ACCCCGAATT GGCAATTTCC GAAGAAGGGG CATTAAAGAT GATTAGCCTG 7380 

CAACGCTGGT TGACGCTGAT TTTTGCCTCT TCCCCCTACG TTAACGCAGA CCATATTCTC 7440 

AATAAATATA ATATCAACCC AGATTCCGAA GGTGGCTTTC ATTTAGCAAC AGACAACTCT 7500 

TCTATTGCTA AATTCTGTAT TTTTTACTTA CCCGAATCCA ATGTCAATAT GAGTTTAGAT 7560 

GCGTTATGGG CAGGGAATCA ACAACTTTGT GCTTCATTGT GTTTTGCGTT GCAGTCTTCA 7620 




CGTTTTATTG GTACTGCATC TGCGTTTCAT AAAAGAGCGG TGGTTTTACA GTGGTTTCCT 
AAAAAACTCG CCGAAATTGC TAATTTAGAT GAATTGCCTG CAAATATCCT TCATGATGTA 
TATATGCACT GCAGTTATGA TTTAGCAAAA AACAAGCACG ATGTTAAGCG TCCATTAAAC 
GAACTTGTCC GCAAGCATAT CCTCACGCAA GGATGGCAAG ACCGCTACCT TTACACCTTA 
GGTAAAAAGG ACGGCAAACC TGTGATGATG GTACTGCTTG AACATTTTAA TTCGGGACAT 
TCGATTTATC GCACGCATTC AACTTCAATG ATTGCTGCTC GAGAAAAATT CTATTTAGTC 
GGCTTAGGCC ATGAGGGCGT TGATAACATA GGTCGAGAAG TGTTTGACGA GTTCTTTGAA 
ATCAGTAGCA ATAATATAAT GGAGAGACTG TTTTTTATCC GTAAACAGTG CGAAACTTTC 
CAACCCGCAG TGTTCTATAT GCCAAGCATT GGCATGGATA TTAGCACGAT TTTTGTGAGC
AACACTCGGC TTGCCCCTAT TCAAGCTGTA GCCTTGGGTC ATCCTGCCAC TACGCATTCT
GAATTTATTG ATTATGTCAT CGTAGAAGAT GATTATGTGG GCAGTGAAGA TTGTTTTAGC 
GAAACCCTTT TACGCTTACC CAAAGATGCC CTACCTTATG TACCATCTGC ACTCGCCCCA 
CAAAAAGTGG ATTATGTACT CAGGGAAAAC CCTGAAGTAG TCAATATCGG TATTGCCGCT 
ACCACAATGA AATTAAACCC TGAATTTTTG CTAACATTGC AAGAAATCAG AGATAAAGCT 
AAAGTCAAAA TACATTTTCA TTTCGCACTT GGACAATCAA CAGGCTTGAC ACACCCTTAT
GTCAAATGGT TTATCGAAAG CTATTTAGGT GACGATGCCA CTGCACATCC CCACGCACCT 
TATCACGATT ATCTGGCAAT ATTGCGTGAT TGCGATATGC TACTAAATCC GTTTCCTTTC 
GGTAATACTA ACGGCATAAT TGATATGGTT ACATTAGGTT TAGTTGGTGT ATGCAAAACG 
GGGGATGAAG TACATGAACA TATTGATGAA GGTCTGTTTA AACGCTTAGG ACTACCAGAA 
TGGCTGATAG CCGACACACG AGAAACATAT ATTGAATGTG CTTTGCGTCT AGCAGAAAAC 
CATCAAGAAC GCCTTGAACT CCGTCGTTAC ATCATAGAAA ACAACGGCTT ACAAAAGCTT 
TTTACAGGCG ACCCTCGTCC ATTGGGCAAA ATACTGCTTA AGAAAACAAA TGAATGGAAG 
CGGAAGCACT TGAGTAAAAA ATAACGGTTT TTTAAAGTAA AAGTGCGGTT AATTTTCAAA 
GCGTTTTAAA AACCTCTCAA AAATCAACCG CACTTTTATC TTTATAACGC TCCCX3CGCGC 
TGACAGTTTA TCTCTTTCTT AAAATACCCA TAAAATTCTG GCAATAGTTG GGTAATCAAA 
TTCAATTGTT GATACGGCAA ACTAAAGACG GCGCGTTCTT CGGCAGTCAT C 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9323 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
CGCCACTTCA ATTTTGGATT GTTGAAATTC AACTAACCAA AAAGTGCGGT TAAAATCTGT 60



GGAGAAAATA GGTTGTAGTG AAGAACGAGG TAATTGTTCA AAAGGATAAA GCTCTCTTAA 
TTGGGCATTG GTTGGCGTTT CTTTTTCGGT TAATAGTAAA TTATATTCTG GACGACTATG 
CAATCCACCA ACAACTTTAC CGTTGGTTTT AAGCGTTAAT GTAAGTTCTT GCTCTTCTTG 
GCGAATACGT AATCCCATTT TTTGTTTAGC AAGAAAATGA TCGGGATAAT CATAATAGGT 
GTTGCCCAAA AATAAATTTT GATGTTCTAA AATCATAAAT TTTGCAAGAT ATTGTGGCAA 
TTCAATACCT ATTTGTGGCG AAATCGCCAA TTTTAATTCA ATTTCTTGTA GCATAATATT 
TCCCACTCAA ATCAACTGGT TAAATATACA AGATAATAAA AATAAATCAA GATTTTTGTG 
ATGACAAACA ACAATTACAA CACCTTTTTT GCAGTCTATA TGCAAATATT TTAAAAAAAT 
AGTATAAATC CGCCATATAA AATGGTATAA TCTTTCATCT TTCATCTTTC ATCTTTCATC 
TTTCATCTTT CATCTTTCAT CTTTCATCTT TCATCTTTCA TCTTTCATCT TTCATCTTTC 
ATCTTTCATC TTTCATCTTT CACATGAAAT GATGAACCGA GGGAAGGGAG GGAGGGGCAA 
GAATGAAGAG GGAGCTGAAC GAACGCAAAT GATAAAGTAA TTTAATTGTT CAACTAACCT 
TAGGAGAAAA TATGAACAAG ATATATCGTC TCAAATTCAG CAAACGCCTG AATGCTTTGG 
TTGCTGTGTC TGAATTGGCA CGGGGTTGTG ACCATTCCAC AGAAAAAGGC AGCGAAAAAC 
CTGCTCGCAT GAAAGTGCGT CACTTAGCGT TAAAGCCACT TTCCGCTATG TTACTATCTT
TAGGTGTAAC ATCTATTCCA CAATCTGTTT TAGCAAGCGG CAATTTAACA TCGACCAAAA 
TGAAATGGTG CAGTTTTTAC AAGAAAACAA GTAATAAAAC CATTATCCGC AACAGTGTTG 
ACGCTATCAT TAATTGGAAA CAATTTAACA TCGACCAAAA TGAAATCGTG CAGTTTTTAC 
AAGAAAACAA CAACTCCGCC GTATTCAACC GTGTTACATC TAACCAAATC TCCCAATTAA 
AAGGGATTTT AGATTCTAAC GGACAAGTCT TTTTAATCAA CCCAAATGGT ATCACAATAG 
GTAAAGACGC AATTATTAAC ACTAATGGCT TTACGGCTTC TACGCTAGAC ATTTCTAACG 
AAAACATCAA GGCGCGTAAT TTCACCTTCG AGCAAACCAA AGATAAAGCG CTCGCTGAAA
TTGTGAATCA CGGTTTAATT ACTGTCGGTA AAGACGGCAG TGTAAATCTT ATTGGTGGCA 
AAGTGAAAAA CGAGGGTGTG ATTAGCGTAA ATGGTGGCAG CATTTCTTTA CTCGCAGGGC 
AAAAAATCAC CATCAGCGAT ATAATAAACC CAACCATTAC TTACAGCATT GCCGCGCCTG
AAAATGAAGC GGTCAATCTG GGCGATATTT TTGCCAAAGG CGGTAACATT AATGTCCGTC
CTGCCACTAT TCGAAACCAA GGTAAACTTT CTGCTGATTC TGTAAGCAAA GATAAAAGCG 
GCAATATTGT TCTTTCCGCC AAAGAGGGTG AAGCGGAAAT TGGCGGTGTA ATTTCCGCTC 
AAAATCAGCA AGCTAAAGGC GGCAAGCTGA TGATAAAGTC CGATAAAGTC ACATTAAAAA 
CAGGTGCAGT TATCGACCTT TCAGGTAAAG AAGGGGGAGA AACTTACCTT GGCGGTGACG 
AGCGCGGCGA AGGTAAAAAC GGCATTCAAT TAGCAAAGAA AACCTCTTTA GAAAAAGGCT 
CAACCATCAA TGTATCAGGC AAAGAAAAAG GCGGACGCGC TATTGTGTGG GGCGATATTG
CGTTAATTGA CGGCAATATT AACGCTCAAG GTAGTGGTGA TATCGCTAAA ACCGGTGGTT
TTGTGGAGAC ATCGGGGCAT TATTTATCCA TTGACAGCAA TGCAATTGTT AAAACAAAAG 
AGTGGTTGCT AGACCCTGAT GATGTAACAA TTGAAGCCGA AGACCCCCTT CGCAATAATA
CCGGTATAAA TGATGAATTC CCAACAGGCA CCGGTGAAGC AAGCGACCCT AAAAAAAATA 
G-GAACTCAA AACAACGCTA ACCAATACAA CTATTTC;^^^^ TTATCTGAAA AACGCCTGGA 
CAATGAATAT AACGGCATCA AGAAAACTTA CCGTTAATAG CTCAATCAAC ATCGGAAGCA 
ACTCCCACTT AATTCTCCAT AGTAAAGGTC AGCGTGGCGG AGGCGTTCAG ATTGATGGAG 
ATATTACTTC TAAAGGCGGA AATTTAACCA TTTATTCTGG CGGATGGGTT GATGTTCATA 
AAAATATTAC GCTTGATCAG GGTTTTTTAA ATATTACCGC CGCTTCCGTA GCTTTTGAAG 
GTGGAAATAA CAAAGCACGC GACGCGGCAA ATGCTAAAAT TGTCGCCCAG GGCAGTGTAA 
CCATTACAGG AGAGGGAAAA GATTTCAGGG CTAACAACGT ATCTTTAAAC GGAACGGGTA 2640

AAGGTCTGAA TATCATTTCA TCAGTGAATA ATTTAACCCA CAATCTTAGT GGCACAATTA 2700

ACATATCTGG GAATATAACA ATTAACCAAA CTACGAGAAA GAACACCTCG TATTGGCT^ 
CCAGCCATGA TTCGCACTGG AACGTCAGTG CTCTTTU^TCT AGAGACAGGC GCAAATTTTA 
CCTTTATTAA ATACATTTCA AGCAATAGCA AAGGCTTAAC AACACAGTAT AGAAGCTCTG 
CAGGGGTGAA TTTTAACGGC GTAAATGGCA ACATGTCATT CAATCTCAAA GAAGGAGCGA 
AAGTTAATTT CAAATTAAAA CCAAACGAGA ACATGAACAC AAGCAAACCT TTACCAATTC
GGTTTTTAGC CAATATCACA GCCACTGGTG GGGGCTCTGT TTTTTTTGAT ATATATGCCA 
ACCATTCTGG CAGAGGGGCT GAGTTAAAAA TGAGTGAAAT TAATATCTCT AACGGCGCTA 
ATTTTACCTT AAATTCCCAT GTTCGCGGCG ATGACGCTTT TAAAATCAAC AAAGACTTAA 
CCATAAATGC AACCAATTCA AATTTCAGCC TCAGACAGAC GAAAGATGAT TTTTATGACG 
GGTACGCACG CAATGCCATC AATTCAACCT ACAACATATC CATTCTGGGC GGTAATGTCA 
CCCTTGGTGG ACAAAACTCA AGCAGCAGCA TTACGGGGAA TATTACTATC GAGAAAGCAG 
CAAATGTTAC GCTAGAAGCC AATAACGCCC CTAATCAGCA AAACATAAGG GATAGAGTTA 
TAAAACTTGG CAGCTTGCTC GTTAATGGGA GTTTAAGTTT AACTGGCGAA AATGCAGATA 
TTAAAGGCAA TCTCACTATT TCAGAAAGCG CCACTTTTAA AGGAAAGACT AGAGATACCC
TAAATATCAC CGGCAATTTT ACCAATAATG GCACTGCCGA AATTAATATA ACACAAGGAG 
TGGTAAAACT TGGCAATGTT ACCAATGATG GTGATTTAAA CATTACCACT CACGCTAAAC 
GCAACCAAAG AAGCATCATC GGCGGAGATA TAATCAACAA AAAAGGAAGC TTAAATATTA 
CAGACAGTAA TAATGATGCT GAAATCCAAA TTGGCGGCAA TATCTCGCAA AAAGAAGGCA 
ACCTCACGAT TTCTTCCGAT AAAATTAATA TCACCAAACA GATAACAATC AAAAAGGGTA 
TTGATGGAGA GGACTCTAGT TCAGATGCGA CAAGTAATGC CAACCTAACT ATTAAAACCA 3900
AAGAATTGAA ATTGACAGAA GACCTAAGTA TTTCAGGTTT CAATAAAGCA GAGATTACAG 3960 
CCAAAGATGG TAGAGATTTA ACTATTGGCA ACAGTAATGA CX;GTAACAGC GOTGCXTGAAG 4020 
CCAAAACAGT AACTTTTAAC AATGTTAAAG ATTCAAAAAT CTCTGCTGAC GGTCACAATG 4080 
TGACACTAAA TAGCAAAGTG AAAACATCTA GCAGCAATGG CGGACGTGAA AGCAATAGCG 4140 
ACAACGATAC CGGCTTAACT ATTACTGCAA AAAATGTAGA AGTAAACAAA GATATTACTT 4200

CTCTCAAAAC AGTAAATATC ACCGCGTCGG AAAAGGTTAC CACCACAGCA GGCTCGACCA 
TTAACGCAAC AAATGGCAAA GCAAGTATTA CAACCAAAAC AGGTGATATC AGCGGTACGA 
TTTCCGGTAA CACGGTAAGT GTTAGCGCGA CTGGTGATTT AACCACTAAA TCCGGCTCAA 
AAATTGAAGC GAAATCGGGT GAGGCTAATG TAACAAGTGC AACAGGTACA ATTGGCGGTA 
CAATTTCCGG TAATACGGTA AATGTTACGG CAAACGCTGG CGATTTAACA GTTGGGAATG 
GCGCAGAAAT TAATGCGACA GAAGGAGCTG CAACCTTAAC CGCAACAGGG AATACCTTGA 
CTACTGAAGC CGGTTCTAGC ATCACTTCAA CTAAGGGTCA GGTAGACCTC TTGGCTCAGA 4620

ATGGTAGCAT CGCAGGAAGC ATTAATGCTG CTAATGTGAC ATTAAATACT ACAGGCACCT 4680

TAACCACCGT GGCAGGCTCG GATATTAAAG CAACCAGCGG CACCTTGGTT ATTAACGCAA 4740

AAGATGCTAA GCTAAATGGT GATGCATCAG GTGATAGTAC AGAAGTGAAT GCAGTCAACG 4800

ACTGGGGATT TGGTAGTGTG ACTGCGGCAA CCTCAAGCAG TGTGAATATC ACTGGGGATT 4860

TAAACACAGT AAATGGGTTA AATATCATTT CGAAAGATGG TAGAAACACT GTGCGCTTAA 
GAGGCAAGGA AATTGAGGTG AAATATATCC AGCCAGGTGT AGCAAGTGTA GAAGAAGTAA 
TTGAAGCGAA ACGCGTCCTT GAAAAAGTAA AAGATTTATC TGATGAAGAA AGAGAAACAT 
TAGCTAAACT TGGTGTAAGT GCTGTACGTT TTGTTGAGCC AAATAATACA ATTACAGTCA 
ATACACAAAA TGAATTTACA ACCAGACCGT CAAGTCAAGT GATAATTTCT GAAGGTAAGG 
CGTGTTTCTC AAGTGGTAAT GGCGCACGAG TATGTACCAA TGTTGCTGAC GATGGACAGC 
CGTAGTCAGT AATTGACAAG GTAGATTTCA TCCTGCAATG AAGTCATTTT ATTTTCGTAT 
TATTTACTGT GTGGGTTAAA GTTCAGTACG GGCTTTACCC ATCTTGTAAA AAATTACGGA 
GAATACAATA AAGTATTTTT AACAGGTTAT TATTATGAAA AATATAAAAA GCAGATTAAA 
ACTCAGTGCA ATATCAGTAT TGCTTGGCCT GGCTTCTTCA TCATTGTATG CAGAAGAAGC 
GTTTTTAGTA AAAGGCTTTC AGTTATCTGG TGCACTTGAA ACTTTAAGTG AAGACGCCCA
ACTGTCTGTA GCAAAATCTT TATCTAAATA CCAAGGCTCG CAAACTTTAA CAAACCTAAA 
AACAGCACAG CTTGAATTAC AGGCTGTGCT AGATAAGATT GAGCCAAATA AATTTGATGT 5640 
GATATTGCCG CAACAAACCA TTACGGATGG CAATATCATG TTTGAGCTAG TCTCGAAATC 5700 
AGCCGCAGAA AGCCAAGTTT TTTATAAGGC GAGCCAGGGT TATAGTGAAG AAAATATCGC 5760 
TCGTAGCCTG CCATCTTTGA AACAAGGAAA AGTGTATGAA GATGGTCGTC AGTGGTTCGA 5820
TTTGCGTGAA TTTAATATGG CAAAAGAAAA CCCGCTTAAG GTTACCCGTG TACATTACGA 5880
ACTAAACCCT AAAAACAAAA CCTCTAATTT GATAATTGCG GGCTTCTCGC CTTTTGGTAA 5940 
AACGCGTAGC TTTATTTCTT ATGATAATTT CGGCGGGAGA GAGTTTAACT ACCAACGTGT 6000 
AAGCTTGGGT TTTGTTAATG CCAATTTAAC TGGTCATGAT GATGTCTTAA TTATACCAGT 6060 
ATGAGTTATG CTGATTCTAA TGATATCGAC GGCTTACCAA GTGCGATTAA TCGTAAATTA 6120 
TCAAAAGGTC AATCTATCTC TGCGAATCTG AAATGGAGTT ATTATCTCCC AACATTTAAC 6180 
CTTGGCATGG AAGACCAATT TAAAATTAAT TTAGGCTACA ACTACCGCCA TATTAATCAA 6240

ACCTCCGCGT TAAATCGCTT GGGTGAAACG AAGAAAAAAT TTGCAGTATC AGGCGTAAGT 6300

GCAGGCATTG ATGGACATAT CCAATTTACC CCTAAAACAA TCTTTAATAT TGATTTAACT 6360

CATCATTATT ACGCGAGTAA ATTACCAGGC TCTTTTGGAA TGGAGCGCAT TGGCGAAACA 6420

TTTAATCGCA GCTATCACAT TAGCACAGCC AGTTTAGGGT TGAGTCAAGA GTTTGCTCAA 6480

GGTTGGCATT TTAGCAGTCA ATTATCAGGT CAATTTACTC TACAAGATAT TAGCAGTATA 6540

GATTTATTCT CTGTAACAGG TACTTATGGC GTCAGAGGCT TTAAATACGG CGGTGCAAGT 6600

GGTGAGCGCG GTCTTGTATG GCGTAATGAA TTAAGTATGC CAAAATACAC CCGCTTCCAA 6660

ATCAGCCCTT ATGCGTTTTA TGATGCAGGT CAGTTCCGTT ATAATAGCGA AAATGCTAAA 6720

ACTTACGGCG AAGATATGCA CACGGTATCC TCTGCGGGTT TAGGCATTAA AACCTCTCCT 6780

ACACAAAACT TAAGCCTAGA TGCTTTTGTT GCTCGTCGCT TTGCAAATGC CAATAGTGAC 6840

AATTTGAATG GCAACAAAAA ACGCACAAGC TCACCTACAA CCTTCTGGGG GAGATTAACA 6900

TTCAGTTTCT AACCCTGAAA TTTAATCAAC TGGTAAGCGT TCCGCCTACC AGTTTATAAC 6960 

TATATGCTTT ACCCGCCAAT TTACAGTCTA TAGGCAACCC TGTTTTTACC CTTATATATC 7020

AAATAAACAA GCTAAGCTGA GCTAAGCAAA CCAAGCAAAC TCAAGCAAGC CAAGTAATAC 7080

TAAAAAAAAA ATTTATATGA TAAACTAAAG TATACTCCAT GCCATGGCGA TACAAGGGAT 7140

TTAATAATAT GACAAAAGAA AATTTGCAAA ACGCTCCTCA AGATGCGACC GCTTTACTTG 7200

CGGAATTAAG CAACAATCAA ACTCCCCTGC GAATATTTAA ACAACCACGC AAGCCCAGCC 7260 

TATTACGCTT GGAACAACAT ATCGCAAAAA AAGATTATGA GTTTGCTTGT CGTGAATTAA 7320 

TGGTGATTCT GGAAAAAATG GACGCTAATT TTGGAGGCGT TCACGATATT GAATTTGACG 7380

CACCCGCTCA GCTGGCATAT CTACCCGAAA AATTACTAAT TTATTTTGCC ACTCGTCTCG 7440

CTAATGCAAT TACAACACTC TTTTCCGACC CCGAATTGGC AATTTCTGAA GAAGGGGCGT 7500

TAAAGATGAT TAGCCTGCAA CGCTGGTTGA CGCTGATTTT TGCCTCTTCC CCCTACGTTA 7560 

ACGCAGACCA TATTCTCAAT AAATATAATA TCAACCCAGA TTCCGAAGGT GGCTTTCATT 7620 

TAGCAACAGA CAACTCTTCT ATTGCTAAAT TCTGTATTTT TTACTTACCC GAATCCAATG 7680 

TCAATATGAG TTTAGATGCG TTATGGGCAG GGAATCAACA ACTTTGTGCT TCATTGTGTT 7740 

TTGCGTTGCA GTCTTCACGT TTTATTGGTA CCGCATCTGC GTTTCATAAA AGAGCGGTGG 7800 

TTTTACAGTG GTTTCCTAAA AAACTCGCCG AAATTGCTAA TTTAGATGAA TTGCCTGCAA 7860

ATATCCTTCA TGATGTATAT ATGCACTGCA GTTATGATTT AGCAAAAAAC AAGCACGATG 7920 

TTAAGCGTCC ATTAAACGAA CTTGTCCGCA AGCATATCCT CACGCAAGGA TGGCAAGACC 7980

GCTACCTTTA CACCTTAGGT AAAAAGGACG GCAAACCTGT GATGATGGTA CTGCTTGAAC 8040 

ATTTTAATTC GGGACATTCG ATTTATCGTA CACATTCAAC TTCAATGATT GCTGCTCGAG 8100 

AAAAATTCTA TTTAGTCGGC TTAGGCCATG AGGGCGTTGA TAAAATAGGT CGAGAAGTGT 8160 

TTGACGAGTT CTTTGAAATC AGTAGCAATA ATATAATGGA GAGACTGTTT TTTATCCGTA 8220



(2) INFORMATION FOR SEQ ID NO:7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 
AACAGTGCGA AACTTTCCAA CCCGCAGTGT TCTATATGCC AAGCATTGGC ATGGATATTA 8280

CCACGATTTT TGTGAGCAAC ACTCGGCTTG CCCCTATTCA AGCTGTAGCC CTGGGTCATC 8340

CTGCCACTAC GCATTCTGAA TTTATTGATT ATGTCATCGT AGAAGATGAT TATGTGGGCA 8400

GTGAAGATTG TTTCAGCGAA ACCCTTTTAC GCTTACCCAA AGATGCCCTA CCTTATGTAC 8460

CTTCTGCACT CGCCCCACAA AAAGTGGATT ATGTACTCAG GGAAAACCCT GAAGTAGTCA 8520

ATATCGGTAT TGCCGCTACC ACAATGAAAT TAAACCCTGA ATTTTTGCTA ACATTGCAAG 
AAATCAGAGA TAAAGCTAAA GTCAAAATAC ATTTTCATTT CGCACTTGGA CAATCAACAG 
GCTTGACACA CCCTTATGTC AAATGGTTTA TCGAAAGCTA TTTAGGTGAC GATGCCACTG 
CACATCCCCA CGCACCTTAT CACGATTATC TGGCAATATT GCGTGATTGC GATATGCTAC 
TAAATCCGTT TCCTTTCGGT AATACTAACG GCATAATTGA TATGGTTACA TTAGGTTTAG 8820

TTGGTGTATG CAAAACGGGG GATGAAGTAC ATGAACATAT TGATGAAGGT CTGTTTAAAC 8880

GCTTAGGACT ACCAGAATGG CTGATAGCCG ACACACGAGA AACATATATT GAATGTGCTT 8940

TGCGTCTAGC AGAAAACCAT CAAGAACGCC TTGAACTCCG TCGTTACATC ATAGAAAACA 9000

ACGGCTTACA AAAGCTTTTT ACAGGCGACC CTCGTCCATT GGGCAAAATA CTGCTTAAGA 9060

AAACAAATGA ATGGAAGCGG AAGCACTTGA GTAAAAAATA ACGGTTTTTT AAAGTAAAAG 9120

TGCGGTTAAT TTTCAAAGCG TTTTAAAAAC GTCTCAAAAA TCAACCGCAC TTTTATCTTT 9180

ATAACGATCC CGCACGCTGA CAGTTTATCA GCCTCCCGCC ATAAAACTCC GCCTTTCATG 9240

GCGGAGATTT TAGCCAAAAC TGGCAGAAAT TTVAAGGCTAA AATCACCAAA TTGCACCACA 9300

AAATCACCAA TACCCACAAA AAA 9323



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATGAACAAGA TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT 60 

GAATTGACAC GGGGTTGTGA CCATTCCACA GAAAAAGGCA GTGAAAAACC TGTTCGTACG 120 

AAAGTACGCC ACTTGGCGTT AAAGCCACTT TCCGCTATAT TGCTATCTTT GGGCATGGCA 180 

TCCATTCCGC AATCTGTTTT AGCGAGCGGT TTACAGGGAA TGAGCGTCGT ACACGGTACA 240 

GCAACCATGC AAGTAGACGG CAATAAAACC ACTATCCGTA ATAGCGTCAA TGCTATCATC 300 

AATTGGAAAC AATTTAACAT TGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAGCAGC 360 

AACTCTGCCG TTTTCAACCG TGTTACATCT GACCAAATCT CCCAATTAAA AGGGATTTTA 420 
GATTCTAACG GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA 480

ATTATTAACA CTAATGGCTT TACTGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG 540

GCGCGTAATT TCACCCTTGA GCAAACCAAG GATAAAGCAC TCGCTGAAAT CGTGAATCAC 600

GGTTTAATTA CCGTTGGTAA AGACGGTAGC GTAAACCTTA TTGGTGGCAA AGTGAAAAAC 660

GAGGGCGTGA TTAGCGTAAA TGGCGGTAGT ATTTCTTTAC TTGCAGGGCA AAAAATCACC 720

ATCAGCGATA TAATAAATCC AACCATCACT TACAGCATTG CTGCACCTGA AAACGAAGCG 780

ATCAATCTGG GCGATATTTT TGCCAAAGGT GGTAACATTA ATGTCCGCGC TGCCACTATT 840

CGCAATAAAG GTAAACTTTC TGCCGACTCT GTAAGCAAAG ATAAAAGTGG TAACATTGTT 900

CTCTCTGCCA AAGAAGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 960

GCCAAAGGTG GTAAGTTGAT GATTACAGGC GATAAAGTTA CATTGAAAAC GGGTGCAGTT 1020

ATCGACCTTT CGGGTAAAGA AGGGGGAGAA ACTTATCTTG GCGGTGACGA GCGTGGCGAA 1080

GGTAAAAACG GCATTCAATT AGCAAAGAAA ACCACTTTAG AAAAAGGCTC AACAATTAAT 1140

GTGTCAGGTA AAGAAAAAGG TGGGCGCGCT ATTGTATGGG GCGATATTGC GTTAATTGAC 1200

GGCAATATTA ATGCCCAAGG TAAAGATATC GCTAAAACTG GTGGTTTTGT GGAGACGTCG 1260

GGGCATTACT TATCCATTGA TGATAACGCA ATTGTTAAAA CAAAAGAATG GCTACTAGAC 1320

CCAGAGAATG TGACTATTGA AGCTCCTTCC GCTTCTCGCG TCGAGCTGGG TGCCGATAGG 1380

AATTCCCACT CGGCAGAGGT GATAAAAGTG ACCCTAAAAA AAAATAACAC CTCCTTGACA 1440

ACACTAACCA ATACAACCAT TTCAAATCTT CTGAAAAGTG CCCACGTGGT GAACATAACG 1500

GCAAGGAGAA AACTTACCGT TAATAGCTCT ATCAGTATAG AAAGAGGCTC CCACTTAATT 1560

CTCCACAGTG AAGGTCAGGG CGGTCAAGGT GTTCAGATTG ATAAAGATAT TACTTCTGAA 1620

GGCGGAAATT TAACCATTTA TTCTGGCGGA TGGGTTGATG TTCATAAAAA TATTACGCTT 1680

GGTAGCGGCT TTTTAAACAT CACAACTAAA GAAGGAGATA TCGCCTTCGA AGACAAGTCT 1740

GGACGGAACA ACCTAACCAT TACAGCCCAA GGGACCATCA CCTCAGGTAA TAGTAACGGC 1800

TTTAGATTTA ACAACGTCTC TCTAAACAGC CTTGGCGGAA AGCTGAGCTT TACTGACAGC 1860

AGAGAGGACA GAGGTAGAAG AACTAAGGGT AATATCTCAA ACAAATTTGA CGGAACGTTA 1920

AACATTTCCG GAACTGTAGA TATCTCAATG AAAGCACCCA AAGTCAGCTG GTTTTACAGA 1980

GACAAAGGAC GCACCTACTG GAACGTAACC ACTTTAAATG TTACCTCGGG TAGTAAATTT 2040

AACCTCTCCA TTGACAGCAC AGGAAGTGGC TCAAGAGGTC CAAGCATACG CAATGCAGAA 2100

TTAAATGGCA TAACATTTAA TAAAGCCACT TTTAATATCG CACAAGGCTC AACAGCTAAC 2160

TTTAGCATCA AGGCATCAAT AATGCCCTTT AAGAGTAACG CTAACTACGC ATTATTTAAT 2220

GAAGATATTT CAGTCTCAGG GGGGGGTAGC CTTAATTTCA AACTTAACGC CTCATCTAGC 2280

AACATACAAA CCCCTGGCGT AATTATAAAA TCTCAJ\AACT TTAATGTCTC AGGAGGGTCA 2340

ACTTTAAATG TCAAGGCTGA AGGTTCAACA GAAACCGCTT TTTCAATAGA AAATGATTTA 2400

AACTTAAACG CCACCGGTGG CAATATAACA ATCAGACAAG TCGAGGGTAC CGATTCACGC 2460

GTCAACAAAG GTGTCGCAGC CAAAAAAAAC ATAACTTTTA AAGGGGGTAA TATCACCTTC 2520

GGCTCTCAAA AAGCCACAAC AGAAATCAAA GGCAATGTTA CCATCAATAA AAACACTAAC 2580

GCTACTCTTT GTGGTGCGAA TTTTGCCGAA AJ^.CAAATCGC CTTTAAATAT AGCAGGAAAT 2640 

GTTATTAATA ATGGCAACCT TACCACTGCC GGCTCCATTA TCAATATAGC CGGAAATCTT 2700 

ACTGTTTCAA AAGGCGCTAA CCTTCAAGCT ATAACAAATT ACACTTTTAA TGTAGCCGGC 2760

TCATTTGACA ACAATGGCGC TTCAAACATT TCCATTGCCA GAGGAGGGGC TAAATTTAAA 2820

GATATCAATA ACACCAGTAG CTTAAATATT ACCACCAACT CTGATACCAC TTACCGCACC 2880

ATTATAAAAG GCAATATATC CAACAAATCA GGTGATTTGA ATATTATTGA TAAAAAAAGC 2940

GACGCTGAAA TCCAAATTGG CGGCAATATC TCACAAAAAG AAGGCAATCT CACAATTTCT 3000

TCTGATAAAG TAAATATTAC CAATCAGATA ACAATCAAAG CAGGCGTTGA AGGGGGGCGT 3060

TCTGATTCAA GTGAGGCAGA AAATGCTAAC CTAACTATTC AAACCAAAGA GTTAAAATTG 3120

GCAGGAGACC TAAATATTTC AGGCTTTAAT AAAGCAGAAA TTACAGCTAA AAATGGCAGT 3180

GATTTAACTA TTGGCAATGC TAGCGGTGGT AATGCTGATG CTAAAAAAGT GACTTTTGAC 3240

AAGGTTAAAG ATTCAAAAAT CTCGACTGAC GGTCACAATG TAACACTAAA TAGCGAAGTG 3300

AAAACGTCTA ATGGTAGTAG CAATGCTGGT AATGATAACA GCACCGGTTT AACCATTTCC 3360

GCAAAAGATG TAACGGTAAA CAATAACGTT ACCTCCCACA AGACAATAAA TATCTCTGCC 3420

GCAGCAGGAA ATGTAACAAC CAAAGAAGGC ACAACTATCA ATGCAACCAC AGGCAGCGTG 3480

GAAGTAACTG CTCAAAATGG TACAATTAAA GGCAACATTA CCTCGCAAAA TGTAACAGTG 3540

ACAGCAACAG AAAATCTTGT TACCACAGAG AATGCTGTCA TTAATGCAAC CAGCGGCACA 3600 

GTAAACATTA GTACAAAAAC AGGGGATATT AAAGGTGGAA TTGAATCAAC TTCCGGTAAT 3660 

GTAAATATTA CAGCGAGCGG CAATACACTT AAGGTAAGTA ATATCACTGG TCAAGATGTA 3720 

ACAGTAACAG CGGATGCAGG AGCCTTGACA ACTACAGCAG GCTCAACCAT TAGTGCGACA 3780

ACAGGCAATG CAAATATTAC AACCAAAACA GGTGATATCA ACGGTAAAGT TGAATCCAGC 3840

TCCGGCTCTG TAACACTTGT TGCAACTGGA GCAACTCTTG CTGTAGGTAA TATTTCAGGT 3900

AACACTGTTA CTATTACTGC GGATAGCGGT AAATTAACCT CCACAGTAGG TTCTACAATT 3960

AATGGGACTA ATAGTGTAAC CACCTCAAGC CAATCAGGCG ATATTGAAGG TACAATTTCT 4020

GGTAATACAG TAAATGTTAC AGCAAGCACT GGTGATTTAA CTATTGGAAA TAGTGCAAAA 4080 

GTTGAAGCGA AAAATGGAGC TGCAACCTTA ACTGCTGAAT CAGGCAAATT AACCACCCAA 4140 

ACAGGCTCTA GCATTACCTC AAGCAATGGT CAGACAACTC TTACAGCCAA GGATAGCAGT 4200 

ATCGCAGGAA ACATTAATGC TGCTAATGTG ACGTTAAATA CCACAGGCAC TTTAACTACT 4260

ACAGGGGATT CAAAGATTAA CGCAACCAGT GGTACCTTAA CAATCAATGC AAAAGATGCC 4320 

AAATTAGATG GTGCTGCATC AGGTGACCGC ACAGTAGTAA ATGCAACTAA CGCAAGTGGC 4380 

TCTGGTAACG TGACTGCGAA AACCTCAAGC AGCGTGAATA TCACCGGGGA TTTAAACACA 4440 

ATAAATGGGT TAAATATCAT TTCGGAAAAT GGTAGAAACA CTGTGCGCTT AAGAGGCAAG 4500
GAAATTGATG TGAAATATAT CCAACCAGGT GTAGCAAGCG TAGAAGAGGT AATTGAAGCG 4560

AAACGCGTCC TTGAGAAGGT AAAAGATTTA TCTGATGAAG AAAGAGAAAC ACTAGCCAAA 4620

CTTGGTGTAA GTGCTGTACG TTTCGTTGAG CCAAATAATG CCATTACGGT TAATACACAA 4680

AACGAGTTTA CAACCAAACC ATCAAGTCAA GTGACAATTT CTGAAGGTAA GGCGTGTTTC 4740

TCAAGTGGTA ATGGCGCACG AGTATGTACC AATGTTGCTG ACGATGGACA GCAG 4794



(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4803 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 



ATGAACAAGA TATATCGTCT CAAATTCAGC AAACGCCTGA ATGCTTTGGT TGCTGTGTCT 60

GAATTGACAC GGGGTTGTGA CCATTCCACA GAAAAAGGCA GTGAAAAACC TGTTCGTACG 120

AAAGTACGCC ACTTGGCGTT AAAGCCACTT TCCGCTATAT TGCTATCTTT GGGCATGGCA 180

TCCATTCCGC AATCTGTTTT AGCGAGCGGT TTACAGGGAA TGAGCGTCGT ACACGGTACA 240

GCAACCATGC AAGTAGACGG CAATAAAACC ACTATCCGTA ATAGCGTCAA TGCTATCATC 300

AATTGGAAAC AATTTAACAT TGACCAAAAT GAAATGGTGC AGTTTTTACA AGAAAGCAGC 360

AACTCTGCCG TTTTCAACCG TGTTACATCT GACCAAATCT CCCAATTAAA AGGGATTTTA 420

GATTCTAACG GACAAGTCTT TTTAATCAAC CCAAATGGTA TCACAATAGG TAAAGACGCA 480

ATTATTAACA CTAATGGCTT TACTGCTTCT ACGCTAGACA TTTCTAACGA AAACATCAAG 540

GCGCGTAATT TCACCCTTGA GCAAACCAAG GATAAAGCAC TCGCTGAAAT CGTGAATCAC 600

GGTTTAATTA CCGTTGGTAA AGACGGTAGC GTAAACCTTA TTGGTGGCAA AGTGAAAAAC 660

GAGGGCGTGA TTAGCGTAAA TGGCGGTAGT ATTTCTTTAC TTGCAGGGCA AAAAATCACC 720

ATCAGCGATA TAATAAATCC AACCATCACT TACAGCATTG CTGCACCTGA AAACGAAGCG 780

ATCAATCTGG GCGATATTTT TGCCAAAGGT GGTAACATTA ATGTCCGCGC TGCCACTATT 840

CGCAATAAAG GTAAACTTTC TGCCGACTCT GTAAGCAAAG ATAAAAGTGG TAACATTGTT 900

CTCTCTGCCA AAGAAGGTGA AGCGGAAATT GGCGGTGTAA TTTCCGCTCA AAATCAGCAA 960

GCCAAAGGTG GTAAGTTGAT GATTACAGGT GATAAAGTCA CATTAAAAAC AGGTGCAGTT 1020

ATCGACCTTT CAGGTAAAGA AGGGGGAGAG ACTTATCTTG GCGGTGATGA GCGTGGCGAA 1080

GGTAAAAATG GTATTCAATT AGCGAAGAAA ACCTCTTTAG AAAAAGGCTC GACAATTAAT 1140

GTATCAGGCA AAGAAAAAGG CGGGCGCGCT ATTGTATGGG GCGATATTGC ATTAATTAAT 1200

GGTAACATTA ATGCTCAAGG TAGCGATATT GCTAAAACTG GCGGCTTTGT GGAAACATCA 1260

GGACATGACT TATCCATTGG TGATGATGTG ATTGTTGACG CTAAAGAGTG GTTATTAGAC 1320

CCAGATGATG TGTCCATTGA AACTCTTACA TCTGGACGCA ATAATACCGG CGAAAACCAA 1380

GGATATACAA CAGGAGATGG GACTAAAGAG TCACCTAAAG GTAATAGTAT TTCTAAACCT 1440

ACATTAACAA ACTCAACTCT TGAGCAAATC CTAAGAAGAG GTTCTTATGT TAATATCACT 1500

GCTAATAATA GAATTTATGT TAATAGCTCC ATCAACTTAT CTAATGGCAG TTTAACACTT 1560

CACACTAAAC GAGATGGAGT TAAAATTAAC GGTGATATTA CCTCAAACGA AAATGGTAAT 1620

TTAACCATTA AAGCAGGCTC TTGGGTTGAT GTTCATAAAA ACATCACGCT TGGTACGGGT 1680

TTTTTGAATA TTGTCGCTGG GGATTCTGTA GCTTTTGAGA GAGAGGGCGA TAAAGCACGT 1740

AACGCAACAG ATGCTCAAAT TACCGCACAA GGGACGATAA CCGTCAATAA AGATGATAAA 1800

CAATTTAGAT TCAATAATGT ATCTATTAAC GGGACGGGCA AGGGTTTAAA GTTTATTGCA 1860

AATCAAAATA ATTTCACTCA TAAATTTGAT GGCGAAATTA ACATATCTGG AATAGTAACA 1920

ATTAACCAAA CCACGAAAAA AGATGTTAAA TACTGGAATG CATCAAAAGA CTCTTACTGG 1980

AATGTTTCTT CTCTTACTTT GAATACGGTG CAAAAATTTA CCTTTATAAA ATTCGTTGAT 2040

AGCGGCTCAA ATTCCCAAGA TTTGAGGTCA TCACGTAGAA GTTTTGCAGG CGTACATTTT 2100

AACGGCATCG GAGGCAAAAC AAACTTCAAC ATCGGAGCTA ACGCAAAAGC CTTATTTAAA 2160

TTAAAACCAA ACGCCGCTAC AGACCCAAAA AAAGAATTAC CTATTACTTT TAACGCCAAC 2220

ATTACAGCTA CCGGTAACAG TGATAGCTCT GTGATGTTTG ACATACACGC CAATCTTACC 2280

TCTAGAGCTG CCGGCATAAA CATGGATTCA ATTAACATTA CCGGCGGGCT TGACTTTTCC 2340

ATAACATCCC ATAATCGCAA TAGTAATGCT TTTGAAATCA AAAAAGACTT AACTATAAAT 2400

GCAACTGGCT CGAATTTTAG TCTTAAGCAA ACGAAAGATT CTTTTTATAA TCAATACAGC 2460

AAACACGCCA TTAACTCAAG TCATAATCTA ACCATTCTTG GCGGCAATGT CACTCTAGGT 2520

GGGGAAAATT CAAGCAGTAG CATTACGGGC AATATCAATA TCACCAATAA AGCAAATGTT 2580

ACATTACAAG CTGACACCAG CAACAGCAAC ACAGGCTTGA AGAAAAGAAC TCTAACTCTT 2640

GGCAATATAT CTGTTGAGGG GAATTTAAGC CTAACTGGTG CAAATCCAAA CATTGTCGGC 2700

AATCTTTCTA TTGCAGAAGA TTCCACATTT AAAGGAGAAG CCAGTGACAA CCTAAACATC 2760

ACCGGCACCT TTACCAACAA CGGTACCGCC AACATTAATA TAAAACAAGG AGTCGTAAAA 2820

CTCCAAGGCG ATATTATCAA TAAAGGTGGT TTAAATATCA CTACTAACGC CTCAGGCACT 2880

CAAAAAACCA TTATTAACGG AAATATAACT AACGAAAAAG GCXSACTTAAA CATCAAGAAT 2940

ATTAAAGCCG AC^CCGAAAT CCAAATTGGC GGCAATATCT CACAAAAAGA AGGCAATCTC 3000

ACAATTTCTT CTGATAAAGT AAATATTACC AATCAGATAA CAATCAAAGC AGGCGTTCAA 3060

GGGGGGCGTT CTGATTCAAG TGAGGCAGAA AATGCTAACC TAACTATTCA AACCAAAGAG 3120

TTAAAATTGG CAGGAGACCT AAATATTTCA GGCTTTAATA AAGCAGAAAT TACAGCTAAA 3180

AATGGCAGTG ATTTAACTAT TGGCAATGCT AGCGGTCGTA ATCCTCATCC TAAAAAAGTC 3240

ACTTTTGACA AGGTTAAAGA TTCAAAAATC TCGACTGACG GTCACAATGT AACACTAAAT 3300

AGCGAAGTGA AAACGTCTAA TGGTAGTAGC AATG<: rGGTA ATGATAACAG CACCGGTTTA 3360
ACCATTTCCG CAAAAGATGT AACGGTAAAC aata/,..:gtta CCTCCCACAA GACAATAAAT 3420
ATCTCTGCCG CAGCAGGAAA TGTAACAACC AAAGAAGG^IA CAACTATCAA TGCAACCACA 3480
GGCAGCGTGG aagt;^j^ctgc TCAAAATGGT ACAATTAAAG GCAACATTAC CTCGCAAAAT 3540
GTAACAGTGA CAGCAACAGA AAATCTTGTT ACCACAGAGA ATGCTGTCAT TAATGCAACC 3600
AGCGGCACAG TAAACATTAG TACAAAAACA GGGGATATTA AAGGTGGAAT TGAATCAACT 3660
TCCGGTAATG TAAATATTAC AGCGAGCGGC AATACACTTA AGGTAAGTAA TATCACTGGT 3720
CAAGATGTAA CAGTAACAGC GGATGCAGGA GCCTTGACAA CTACAGCAGG CTCAACCATT 3780
AGTGCGACAA CAGGCAATGC AAATATTACA ACCAAAACAG GTGATATCAA CGGTAAAGTT 3840
GAATCCAGCT CCGGCTCTGT AACACTTGTT GCAACTGGAG CAACTCTTGC TGTAGGTAAT 3900
ATTTCAGGTA ACACTGTTAC TATTACTGCG GATAGCGGTA AATTAACCTC CACAGTAGGT 3960
TCTACAATTA ATGGGACTAA TAGTGTAACC ACCTCAAGCC AATCAGGCGA TATTGAAGGT 4020
ACAATTTCTG GTAATACAGT AAATGTTACA GCAAGCACTG GTGATTTAAC TATTGGAAAT 4080
AGTGCAAAAG TTGAAGCGAA AAATGGAGCT GCAACCTTAA CTGCTGAATC AGGCAAATTA 4140
ACCACCCAAA CAGGCTCTAG CATTACCTCA AGCAATGGTC AGACAACTCT TACAGCCAAG 4200
GATAGCAGTA TCGCAGGAAA CATTAATGCT GCTAATGTGA CGTTAAATAC CACAGGCACT 4260
TTAACTACTA CAGGGGATTC AAAGATTAAC GCAACCAGTG GTACCTTAAC AATCAATGCA 4320
AAAGATGCCA AATTAGATGG TGCTGCATCA GGTGACCGCA CAGTAGTAAA TGCAACTAAC 4380
GCAAGTGGCT CTGGTAACGT GACTGCGAAA ACCTCAAGCA GCGTGAATAT CACCGGGGAT 4440
TTAAACACAA TAAATGGGTT AAATATCATT TCGGAAAATG GTAGAAACAC TGTGCGCTTA 4500
AGAGGCAAGG AAATTGATGT GAAATATATC CAACCAGGTG TAGCAAGCGT AGAAGAGGTA 4560
ATTGAAGCGA AACGCGTCCT TGAGAAGGTA AAAGATTTAT CTGATGAAGA AAGAGAAACA 4620
CTAGCCAAAC TTGGTGTAAG TGCTGTACGT TTCGTTGAGC CAAATAATGC CATTACGGTT 4680
AATACACAAA ACGAGTTTAC AACCAAACCA TCAAGTCAAG TGACAATTTC TGAAGGTAAG 4740
GCGTGTTTCT CAAGTGGTAA TGGCGCACGA GTATGTACCA ATGTTGCTGA CGATGGACAG 4800
CAG 4803



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1599 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr
65 70 75 80

Ala Thr Met Gln Val Asp Gly Asn Lys Thr Thr Ile Arg Asn Ser Val
85 90 95

Asn Ala Ile Ile Asn Trp Lys Gln Phe Asn Ile Asp Gln Asn Glu Met
100 105 110

Glu Gln Phe Leu Gln Glu Ser Ser Asn Ser Ala Val Phe Asn Arg Val
115 120 125

Thr Ser Asp Gln Ile Ser Gln Leu Lys Gly Ile Leu Asp Ser Asn Gly
130 135 140

Gln Val Phe Leu Ile Asn Pro Asn Gly Ile Thr Ile Gly Lys Asp Ala
145 150 155 160

Ile Ile Asn Thr Asn Gly Phe Thr Ala Ser Thr Leu Asp Ile Ser Asn
165 170 175

Glu Asn Ile Lys Ala Arg Asn Phe Thr Leu Glu Gln Thr Lys Asp Lys
180 185 190

Ala Leu Ala Glu Ile Val Asn His Gly Leu Ile Thr Val Gly Lys Asp
195 200 205

Gly Ser Val Asn Leu Ile Gly Gly Lys Val Lys Asn Glu Gly Val Ile
210 215 220

Ser Val Asn Gly Gly Ser Ile Ser Leu Leu Ala Gly Gln Lys Ile Thr
225 230 235 240

Ile Ser Asp Ile Ile Asn Pro Thr Ile Thr Tyr Ser Ile Ala Ala Pro
245 250 255

Glu Asn Glu Ala Ile Asn Leu Gly Asp Ile Phe Ala Lys Gly Gly Asn
260 265 270

Ile Asn Val Arg Ala Ala Thr Ile Arg Asn Lys Gly Lys Leu Ser Ala
275 280 285

Asp Ser Val Ser Lys Asp Lys Ser Gly Asn Ile Val Leu Ser Ala Lys
290 295 300

Glu Gly Glu Ala Glu Ile Gly Gly Val Ile Ser Ala Gln Asn Gln Gln
305 310 315 320

Ala Lys Gly Gly Lys Leu Met Ile Thr Gly Asp Lys Val Thr Leu Lys
325 330 335

Thr Gly Ala Val Ile Asp Leu Ser Gly Lys Glu Gly Gly Glu Thr Tyr
340 345 350

Leu Gly Gly Asp Glu Arg Gly Glu Gly Lys Asn Gly Ile Gln Leu Ala
355 360 365

Lys Lys Thr Thr Leu Glu Lys Gly Ser Thr Ile Asn Val Ser Gly Lys
370 375 380

Glu Lys Gly Gly Arg Ala Ile Val Trp Gly Asp Ile Ala Leu Ile Asp
385 390 395 400

Gly Asn Ile Asn Ala Gln Gly Lys Asp Ile Ala Lys Thr Gly Gly Phe
405 410 415

Val Glu Thr Ser Gly His Tyr Leu Ser Ile Asp Asp Asn Ala Ile Val
420 425 430

Lys Thr Lys Glu Trp Leu Leu Asp Pro Glu Asn Val Thr Ile Glu Ala
435 440 445

Pro Ser Ala Ser Arg Val Glu Leu Gly Ala Asp Arg Asn Ser His Ser
450 455 460

Ala Glu Val Ile Lys Val Thr Leu Lys Lys Asn Asn Thr Ser Leu Thr
465 470 475 480

Thr Leu Thr Asn Thr Thr Ile Ser Asn Leu Leu Lys Ser Ala His Val
485 490 495

Val Asn Ile Thr Ala Arg Arg Lys Leu Thr Val Asn Ser Ser Ile Ser
500 505 510

Ile Glu Arg Gly Ser His Leu Ile Leu His Ser Glu Gly Gln Gly Gly
515 520 525

Gln Gly Val Gln Ile Asp Lys Asp Ile Thr Ser Glu Gly Gly Asn Leu
530 535 540

Thr Ile Tyr Ser Gly Gly Trp Val Asp Val His Lys Asn Ile Thr Leu
545 550 555 560

Gly Ser Gly Phe Leu Asn Ile Thr Thr Lys Glu Gly Asp Ile Ala Phe
565 570 575

Glu Asp Lys Ser Gly Arg Asn Asn Leu Thr Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Ser Gly Asn Ser Asn Gly Phe Arg Phe Asn Asn Val Ser Leu
595 600 605

Asn Ser Leu Gly Gly Lys Leu Ser Phe Thr Asp Ser Arg Glu Asp Arg
610 615 620

Gly Arg Arg Thr Lys Gly Asn Ile Ser Asn Lys Phe Asp Gly Thr Leu
625 630 635 640

Asn Ile Ser Gly Thr Val Asp Ile Ser Met Lys Ala Pro Lys Val Ser
645 650 655

Trp Phe Tyr Arg Asp Lys Gly Arg Thr Tyr Trp Asn Val Thr Thr Leu
660 665 670

Asn Val Thr Ser Gly Ser Lys Phe Asn Leu Ser Ile Asp Ser Thr Gly
675 680 685



Ser Gly Ser Thr Gly Pro Ser Ile Arg Asn Ala Glu Leu Asn Gly Ile
690 695 700

Thr Phe Asn Lys Ala Thr Phe Asn Ile Ala Gln Gly Ser Thr Ala Asn
705 710 715 720

Phe Ser Ile Lys Ala Ser Ile Met Pro Phe Lys Ser Asn Ala Asn Tyr
725 730 735

Ala Leu Phe Asn Glu Asp Ile Ser Val Ser Gly Gly Gly Ser Val Asn
740 745 750

Phe Lys Leu Asn Ala Ser Ser Ser Asn Ile Gln Thr Pro Gly Val Ile
755 760 765

Ile Lys Ser Gln Asn Phe Asn Val Ser Gly Gly Ser Thr Leu Asn Leu
770 775 780

Lys Ala Glu Gly Ser Thr Glu Thr Ala Phe Ser Ile Glu Asn Asp Leu
785 790 795 800

Asn Leu Asn Ala Thr Gly Gly Asn Ile Thr Ile Arg Gln Val Glu Gly
805 810 815

Thr Asp Ser Arg Val Asn Lys Gly Val Ala Ala Lys Lys Asn Ile Thr
820 825 830

Phe Lys Gly Gly Asn Ile Thr Phe Gly Ser Gln Lys Ala Thr Thr Glu
835 840 845

Ile Lys Gly Asn Val Thr Ile Asn Lys Asn Thr Asn Ala Thr Leu Arg
850 855 860

Gly Ala Asn Phe Ala Glu Asn Lys Ser Pro Leu Asn Ile Ala Gly Asn
865 870 875 880

Val Ile Asn Asn Gly Asn Leu Thr Thr Ala Gly Ser Ile Ile Asn Ile
885 890 895

Ala Gly Asn Leu Thr Val Ser Lys Gly Ala Asn Leu Gln Ala Ile Thr
900 905 910

Asn Ti^r Thr Phe Asn Val Ala Gly Ser Phe Asp Asn Asn Gly Ala Ser
915 920 925

Asn Ile Ser Ile Ala Arg Gly Gly Ala Lys Phe Lys Asp Ile Asn Asn
930 935 940

Thr Ser Ser Leu Asn Ile Thr Thr Asn Ser Asp Thr Thr Tyr Ara Thr
945 950 955 960

Ile Ile Lys Gly Asn Ile Ser Asn Lys Ser Gly Asp Leu Asn Ile Ile
965 970 975

Asp Lys Lys Ser Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser Gln
980 985 990

Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr Asn
995 1000 1005

Gln Ile Thr Ile Lys Ala Gly Val Glu Gly Gly Arg Ser Asp Ser Ser
1010 1015 1020

Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys Leu
1025 1030 1035 1040



Ala Gly Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr Ala
1045 1050 1055

Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn Ala
1060 1065 1070

Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys Ile Ser
1075 1080 1085

^olo''^'' "'^ ""^^ r^^5^^" ^-1" Thr ser Asn 



1100 



Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile Ser
1105 1110 1115 1120

Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr Ile
1125 1130 1135

Asn Ile Ser Ala Ala Ala Gly Asn Val Thr Thr Lys Glu Gly Thr Thr
1140 1145 1150

Ile Asn Ala Thr Thr Gly Ser Val Glu Val Thr Ala Gln Asn Gly Thr
1155 1160 1165

iHo^^^ ^^"^ ^^"^ ''^^ "^hr Val Thr Ala Thr Glu 

11'75 1180 

Asn Leu Val Thr Thr Glu Asn Ala Val Ile Asn Ala Thr Ser Gly Thr
1185 1190 1195 1200

Val Asn Ile Ser Thr Lys Thr Gly Asp Ile Lys Gly Gly Ile Glu Ser
1205 1210 1215

Thr Ser Gly Asn Val Asn Ile Thr Ala Ser Gly Asn Thr Leu Lys Val
1220 1225 1230

Ser Asn Ile Thr Gly Gln Asp Val Thr Val Thr Ala Asp Ala Gly Ala
1235 1240 1245

Leu Thr Thr Thr Ala Gly Ser Thr Ile Ser Ala Thr Thr Gly Asn Ala
1250 1255 1260

Asn Ile Thr Thr Lys Thr Gly Asp Ile Asn Gly Lys Val Glu Ser Ser
1265 1270 1275 1280

Ser Gly Ser Val Thr Leu Val Ala Thr Gly Ala Thr Leu Ala Val Gly
1285 1290 1295

Asn Ile Ser Gly Asn Thr Val Thr Ile Thr Ala Asp Ser Gly Lys Leu
1300 1305 1310

Thr Ser Thr Val Gly Ser Thr Ile Asn Gly Thr Asn Ser Val Thr Thr
1315 1320 1325

^^"^ f ??o^^" ^^"^ Ser Gly Asn Thr Val 

•••JJ" 1335 1340 

Asn Val Thr Ala Ser Thr Gly Asp Leu Thr Ile Gly Asn Ser Ala Lys
1345 1350 1355 1360

Val Glu Ala Lys Asn Gly Ala Ala Thr Leu Thr Ala Glu Ser Gly Lys
1365 1370 1375

Leu Thr Thr Gln Thr Gly Ser Ser Ile Thr Ser Ser Asn Gly Gln Thr
1380 1385 1390



Thr Leu Thr Ala Lys Asp Ser Ser Ile Ala Gly Asn Ile Asn Ala Ala
1395 1400 1405

Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly Asp Ser
1410 1415 1420

Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp Ala
1425 1430 1435 1440

Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala Thr
1445 1450 1455

Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser Val
1460 1465 1470

Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile Ser
1475 1480 1485

""^^ Jlso''^'' """^ ""^^ y^L'^^^ Gly Lys Glu He Asp Val 

^^^^ 1500 

Lys Tyr Ile Gln Pro Gly Val Ala Ser Val Glu Glu Val Ile Glu Ala
1505 1510 1515 1520

Lys Arg Val Leu Glu Lys Val Lys Asp Leu Ser Asp Glu Glu Arg Glu
1525 1530 1535

Thr Leu Ala Lys Leu Gly Val Ser Ala Val Arg Phe Val Glu Pro Asn
1540 1545 1550

Asn Ala Ile Thr Val Asn Thr Gln Asn Glu Phe Thr Thr Lys Pro Ser
1555 1560 1565

?S7o''^^ """-^ ^he ser Ser Gly Asn 

1575 1580 

Gly Ala Arg Val Cys Thr Asn Val Ala Asp Asp Gly Gln Gln Pro
1585 1590 1595

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1600 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Met Asn Lys Ile Tyr Arg Leu Lys Phe Ser Lys Arg Leu Asn Ala Leu
1 5 10 15

Val Ala Val Ser Glu Leu Thr Arg Gly Cys Asp His Ser Thr Glu Lys
20 25 30

Gly Ser Glu Lys Pro Val Arg Thr Lys Val Arg His Leu Ala Leu Lys
35 40 45

Pro Leu Ser Ala Ile Leu Leu Ser Leu Gly Met Ala Ser Ile Pro Gln
50 55 60

Ser Val Leu Ala Ser Gly Leu Gln Gly Met Ser Val Val His Gly Thr
65 70 75 80




ox„ v.x ..p 

.1. XU XX, 2 
=1. =1. Ph, OX. .x„ s„ s s„ .X. „.x 1° 

125 

Thr Ser Asp Gin He q«a>- r-i t 

P S,r GX„ Leu Lys Oly XX. Leu A,p s,r Aa„ Gly 

jxn V.X ... ^^^^ 

Xla XX. ^^^^ 2° 

xxe .X. ^„ ^„ ^ . 

..u .X. oxu XX. V.X „x. ,x. XX. ™. V.X cx'° 

=Xy s„ v,x z..„ XI. Oly .Xy x,ys VaX Lys J.„ c=Xy vax rx. 

Ijr V.X OXy OXy Sej XX. S.. L.u L.u O^Xy .X„ Ly. XX. XH. 

XI. S.. ..p XX. XX. 

OX. OXU jxa XX, L.U GXy xX. P.. .x. Ly. .Xy OX^y 

270 

XX. vax „. .H. XX. Ly. OXy Ly. L.u s., .X. 

S.J v,x s.. Ly. ..p . OXy »s„ xx. V.X s.. .X. Lys 

01. OXy OXU .1, .lu ^^^^^ 

AX. Lys OXy Oly Lys Leu „e. 11. ... jly .sp Lys v,l r>,r X.u Lyl 

Th. OXy vax Xle ..p Leu OXy Lys olu OXy OXy Olu Z ryr, 

..u Oly OXy .sp olu ^ oXy olu Oly Ly, dy ,Xe Ol^ Leu «a 

"^^^ 365 
X-ya Lys Thr Th. Lau Olu Lys Oly Se. Th. xle ^» vaX s.r OXy Lya 

«u Lys Oly Oly ^ „. „a V.X oly Jsp Xll Leu Xle x^sp 

400 

♦ Gly Asn He Asn Ala Gin Glv Ser- ti 

405 ^ ^^"^ Thr Gly Gly Phe 

415 

Val Glu Thr Ser Gly His Asp Leu Ser He Gl^ . 

420 ^ ^ ^If ^-^^ Asp Asp Val He Val 

430 



Asp Ala Lys Glu Trp Leu Leu Asp Pro Asp Asp Val Ser Ile Glu Thr
435 440 445

Leu Thr Ser Gly Arg Asn Asn Thr Gly Glu Asn Gln Gly Tyr Thr Thr
450 455 460

Gly Asp Gly Thr Lys Glu Ser Pro Lys Gly Asn Ser Ile Ser Lys Pro
465 470 475 480

Thr Leu Thr Asn Ser Thr Leu Glu Gln Ile Leu Arg Arg Gly Ser Tyr
485 490 495

Val Asn Ile Thr Ala Asn Asn Arg Ile Tyr Val Asn Ser Ser Ile Asn
500 505 510

Leu Ser Asn Gly Ser Leu Thr Leu His Thr Lys Arg Asp Gly Val Lys
515 520 525

Ile Asn Gly Asp Ile Thr Ser Asn Glu Asn Gly Asn Leu Thr Ile Lys
530 535 540

Ala Gly Ser Trp Val Asp Val His Lys Asn Ile Thr Leu Gly Thr Gly
545 550 555 560

Phe Leu Asn Ile Val Ala Gly Asp Ser Val Ala Phe Glu Arg Glu Gly
565 570 575

Asp Lys Ala Arg Asn Ala Thr Asp Ala Gln Ile Thr Ala Gln Gly Thr
580 585 590

Ile Thr Val Asn Lys Asp Asp Lys Gln Phe Arg Phe Asn Asn Val Ser
595 600 605

Leu Asn Gly Thr Gly Lys Gly Leu Lys Phe Ile Ala Asn Gln Asn Asn
610 615 620

Phe Thr His Lys Phe Asp Gly Glu Ile Asn Ile Ser Gly Ile Val Thr
625 630 635 640

Ile Asn Gln Thr Thr Lys Lys Asp Val Lys Tyr Trp Asn Ala Ser Lys
645 650 655

Asp Ser Tyr Trp Asn Val Ser Ser Leu Thr Leu Asn Thr Val Gln Lys
660 665 670

Phe Thr Phe Ile Lys Phe Val Asp Ser Gly Ser Asn Gly Gln Asp Leu
675 680 685

Arg Ser Ser Arg Arg Ser Phe Ala Gly Val His Phe Asn Gly Ile Gly
690 695 700

Gly Lys Thr Asn Phe Asn Ile Gly Ala Asn Ala Lys Ala Leu Phe Lys
705 710 715 720

Leu Lys Pro Asn Ala Ala Thr Asp Pro Lys Lys Glu Leu Pro Ile Thr
725 730 735

Phe Asn Ala Asn Ile Thr Ala Thr Gly Asn Ser Asp Ser Ser Val Met
740 745 750

Phe Asp Ile His Ala Asn Leu Thr Ser Arg Ala Ala Gly Ile Asn Met
755 760 765

Asp Ser Ile Asn Ile Thr Gly Gly Leu Asp Phe Ser Ile Thr Ser His
770 775 780



Asn Arg Asn Ser Asn Ala Phe Glu Ile Lys Lys Asp Leu Thr Ile Asn
785 790 795 800

Ala Thr Gly Ser Asn Phe Ser Leu Lys Gln Thr Lys Asp Ser Phe Tyr
805 810 815

Asn Glu Tyr Ser Lys His Ala Ile Asn Ser Ser His Asn Leu Thr Ile
820 825 830

Leu Gly Gly Asn Val Thr Leu Gly Gly Glu Asn Ser Ser Ser Ser Ile
835 840 845

Thr Gly Asn Ile Asn Ile Thr Asn Lys Ala Asn Val Thr Leu Gln Ala
850 855 860

Asp Thr Ser Asn Ser Asn Thr Gly Leu Lys Lys Arg Thr Leu Thr Leu
865 870 875 880

Gly Asn Ile Ser Val Glu Gly Asn Leu Ser Leu Thr Gly Ala Asn Ala
885 890 895

Asn Ile Val Gly Asn Leu Ser Ile Ala Glu Asp Ser Thr Phe Lys Gly
900 905 910

Glu Ala Ser Asp Asn Leu Asn Ile Thr Gly Thr Phe Thr Asn Asn Gly
915 920 925

Thr Ala Asn Ile Asn Ile Lys Gly Val Val Lys Leu Gly Asp Ile Asn
930 935 940

Asn Lys Gly Gly Leu Asn Ile Thr Thr Asn Ala Ser Gly Thr Gln Lys
945 950 955 960

Thr Ile Ile Asn Gly Asn Ile Thr Asn Glu Lys Gly Asp Leu Asn Ile
965 970 975

Lys Asn Ile Lys Ala Asp Ala Glu Ile Gln Ile Gly Gly Asn Ile Ser
980 985 990

Gln Lys Glu Gly Asn Leu Thr Ile Ser Ser Asp Lys Val Asn Ile Thr
995 1000 1005

Asn Gln Ile Thr Ile Lys Ala Gly Val Glu Gly Gly Arg Ser Asp Ser
1010 1015 1020

Ser Glu Ala Glu Asn Ala Asn Leu Thr Ile Gln Thr Lys Glu Leu Lys
1025 1030 1035 1040

Leu Ala Gly Asp Leu Asn Ile Ser Gly Phe Asn Lys Ala Glu Ile Thr
1045 1050 1055

Ala Lys Asn Gly Ser Asp Leu Thr Ile Gly Asn Ala Ser Gly Gly Asn
1060 1065 1070

Ala Asp Ala Lys Lys Val Thr Phe Asp Lys Val Lys Asp Ser Lys Ile
1075 1080 1085

^^'^ lolo'^'' yniJ''^ Thr ser 

i095 1100 

Asn Gly Ser Ser Asn Ala Gly Asn Asp Asn Ser Thr Gly Leu Thr Ile
1105 1110 1115 1120

Ser Ala Lys Asp Val Thr Val Asn Asn Asn Val Thr Ser His Lys Thr
1125 1130 1135



Ile Asn Ile Ser Ala Ala Ala Gly Asn Val Thr Thr Lys Glu Gly Thr
1140 1145 1150

Thr Ile Asn Ala Thr Thr Gly Ser Val Glu Val Thr Ala Gln Asn Gly
1155 1160 1165

Thr Ile Lys Gly Asn Ile Thr Ser Gln Asn Val Thr Val Thr Ala Thr
1170 1175 1180

Glu Asn Leu Val Thr Thr Glu Asn Ala Val Ile Asn Ala Thr Ser Gly
1185 1190 1195 1200

Thr Val Asn Ile Ser Thr Lys Thr Gly Asp Ile Lys Gly Gly Ile Glu
1205 1210 1215

Ser Thr Ser Gly Asn Val Asn Ile Thr Ala Ser Gly Asn Thr Leu Lys
1220 1225 1230

Val Ser Asn Ile Thr Gly Gln Asp Val Thr Val Thr Ala Asp Ala Gly
1235 1240 1245

Ala Leu Thr Thr Thr Ala Gly Ser Thr Ile Ser Ala Thr Thr Gly Asn
1250 1255 1260

Ala Asn Ile Thr Thr Lys Thr Gly Asp Ile Asn Gly Lys Val Glu Ser
1265 1270 1275 1280

Ser Ser Gly Ser Val Thr Leu Val Ala Thr Gly Ala Thr Leu Ala Val
1285 1290 1295

Gly Asn Ile Ser Gly Asn Thr Val Thr Ile Thr Ala Asp Ser Gly Lys
1300 1305 1310

Leu Thr Ser Thr Val Gly Ser Thr Ile Asn Gly Thr Asn Ser Val Thr
1315 1320 1325

Thr Ser Ser Gln Ser Gly Asp Ile Glu Gly Thr Ile Ser Gly Asn Thr
1330 1335 1340

Val Asn Val Thr Ala Ser Thr Gly Asp Leu Thr Ile Gly Asn Ser Ala
1345 1350 1355 1360

Lys Val Glu Ala Lys Asn Gly Ala Ala Thr Leu Thr Ala Glu Ser Gly
1365 1370 1375

Lys Leu Thr Thr Gln Thr Gly Ser Ser Ile Thr Ser Ser Asn Gly Gln
1380 1385 1390

Thr Thr Leu Thr Ala Lys Asp Ser Ser Ile Ala Gly Asn Ile Asn Ala
1395 1400 1405

Ala Asn Val Thr Leu Asn Thr Thr Gly Thr Leu Thr Thr Thr Gly Asp
1410 1415 1420

Ser Lys Ile Asn Ala Thr Ser Gly Thr Leu Thr Ile Asn Ala Lys Asp
1425 1430 1435 1440

Ala Lys Leu Asp Gly Ala Ala Ser Gly Asp Arg Thr Val Val Asn Ala
1445 1450 1455

Thr Asn Ala Ser Gly Ser Gly Asn Val Thr Ala Lys Thr Ser Ser Ser
1460 1465 1470

Val Asn Ile Thr Gly Asp Leu Asn Thr Ile Asn Gly Leu Asn Ile Ile
1475 1480 1485



llto''" "'^ -'9 01y^.y, oiu XX. ..p 

V.X^.X. T.. XX. OX„ P„ .X. V.X s„ V.X IToXu „.x xxe cx„ 

AX. „.x ,.„^,x„ .y. „.X «p^,,„ 3„ ..p .X. a,.Z' 

OXu TH. X..U JX,^X.y. X..U OX,- V,X S«^.X. VaX .,3 PK, V.X^lT„„ 
Asn jx.^xle Th. v.i J, ,„ ^„ 

1565 

Ser Ser Gin Val Thr n r-i 

1570 ''''^ ?|7S^'" ""^^ -^^^ Phe Ser Ser Gly 

1580 

Asn Gly Ala Arg Val Cys Thr Asn Val Ala Asp Asp Gly Gln Gln Pro
1585 1590 1595 1600

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

V.X Ol. V.X XX. OX„ x.^.. ^ XX. X,eu oiu X.ys V.l ..p 

I^U Se. A.P Glu oXu Arg GXu Ala L.u Al. x.ys Leu Gly 



